What you can fire on
BitM leaves signals at four layers. None of them on their own is bulletproof — but combined into a weighted score, they catch the attack with high confidence and very low false-positive rate. The four layers, in fidelity order:
- Network layer — the noVNC WebSocket carrying RFB protocol frames. Highest fidelity. RFB on a WebSocket is essentially never legitimate.
- Browser/client layer — canvas-rendered login pages with no real password input, frame mismatches, noVNC library globals. Detectable from a browser extension or a managed-browser policy.
- Behavioral layer — input lag, cursor lag, DPI mismatches caused by the remote container. Useful as a secondary signal; not actionable on its own.
- OAuth/identity layer — refresh-token issuance from new ASNs, broad-scope grants to Desktop-app clients, Gmail batch-API access patterns that real clients do not produce. This is the post-compromise signal that catches the attacker even if you missed the live capture.
We tested all of these in our research lab against a controlled BitM setup. The observations below are ground truth, not theory.
Network-layer signals
RFB protocol handshake on a WebSocket
The strongest signal available. noVNC opens a WebSocket from the victim's browser to the attacker's relay. The first server-to-client frame on that socket is the literal byte sequence RFB 003.xxx\n — the VNC protocol version negotiation. No legitimate web application sends this byte sequence in a WebSocket frame. If your gateway does deep packet inspection on WebSocket payloads after the upgrade handshake, fire on the first 16 bytes of the first frame after the upgrade (the RFB version string itself is exactly 12 bytes).
Combined with the JA4 fingerprint of the WebSocket's TLS handshake, this gives you a near-zero-false-positive detection. The cost is that DPI on encrypted WebSocket traffic requires SSL inspection, which most consumer-grade environments do not have.
For corporate environments with managed browsers and SSL inspection, this is the headline detection. For everything else, you fall back to the browser-side signals below.
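As a sketch, the byte-level check itself is trivial once a DPI pipeline hands you the decrypted payload (the function name and input shape here are illustrative assumptions, not a specific gateway API):

```python
# Sketch: flag a WebSocket stream whose first server-to-client payload
# is a VNC/RFB version-negotiation string. Assumes an SSL-inspecting
# gateway has already handed us the decrypted post-upgrade bytes.
RFB_MAGIC = b"RFB 003."  # e.g. b"RFB 003.008\n" for RFB protocol 3.8

def is_rfb_handshake(first_payload: bytes) -> bool:
    """True if the first WebSocket frame looks like a VNC handshake."""
    # Only the first 16 bytes matter; the version string is 12 bytes.
    return first_payload[:16].startswith(RFB_MAGIC)

assert is_rfb_handshake(b"RFB 003.008\n")
assert not is_rfb_handshake(b'{"type":"chat.message"}')
```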
WebSocket destination during sign-in flows
The victim's browser opens a WebSocket that does not go to Google. During what looks like a Google sign-in flow, all traffic should plausibly be going to accounts.google.com, gstatic.com, and Google ASN ranges. A WebSocket to a single VPS IP on a port like 6080 or 8443, against a freshly-registered domain, is the BitM relay.
Tune for the legitimate cases:
- Customer support widgets that open WebSockets to vendor domains during page interactions
- Real-time collaboration tools (when the user happens to have one open during a sign-in)
- Some MFA / push-notification implementations that use WebSockets for the push channel
The signal is not "WebSocket to non-Google host" alone. It is "WebSocket to a freshly-registered single-VPS domain and the page presents itself as a Google login."
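Expressed as a heuristic, that compound condition might look like the following sketch (the domain-age lookup and login-context detection are assumed to exist elsewhere in your pipeline; ports and thresholds are illustrative):

```python
# Sketch: score a WebSocket destination observed during a sign-in flow.
# Inputs (domain age, login context) come from upstream enrichment.
GOOGLE_SUFFIXES = (".google.com", ".gstatic.com", ".googleapis.com")
RELAY_PORTS = {6080, 8443}  # common noVNC/websockify listener ports

def suspicious_ws_destination(host: str, port: int,
                              domain_age_days: int,
                              in_login_context: bool) -> bool:
    if host.endswith(GOOGLE_SUFFIXES):
        return False                  # expected IdP traffic
    if not in_login_context:
        return False                  # only fire during sign-in flows
    fresh = domain_age_days < 30      # freshly registered lure domain
    relay_port = port in RELAY_PORTS
    return fresh and (relay_port or port not in (80, 443))

assert suspicious_ws_destination("lure-example.top", 6080, 3, True)
assert not suspicious_ws_destination("accounts.google.com", 443, 9000, True)
```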
DNS-and-IP impedance mismatch
The lure domain resolves to a single VPS IP. The rendered page implies the user is interacting with Google. NetFlow shows zero traffic to Google ASN from the victim's machine during the supposed Google sign-in. This is a pivot signal — useful when you are already investigating a suspicious sign-in event and want to confirm whether the user actually reached Google or not.
Browser/client-side signals
These are the signals our BitM Shield extension fires on (covered in detail in the mitigations post). Useful for individual users without enterprise log infrastructure.
Canvas-only login forms
A page that asks for a password but has zero <input type="password"> elements in the real DOM is essentially never legitimate. The BitM relay renders the entire login UI to a <canvas> element via noVNC. Real Google login pages have real DOM inputs.
Detection logic: count <input type="password"> elements in the DOM. If the page has visual cues consistent with a login form (URL contains login, signin, auth; title contains "Sign in"; visible text like "Password") but zero password inputs, the form is being rendered to canvas. Fire.
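The same logic as pure code (in practice the inputs come from a content script walking the DOM; the function shape and names are illustrative):

```python
# Sketch of the canvas-only-login heuristic. A page that visually asks
# for a password but has zero real password inputs in the DOM fires.
import re

LOGIN_URL_HINTS = re.compile(r"login|signin|auth", re.I)

def canvas_only_login(url: str, title: str, visible_text: str,
                      password_input_count: int) -> bool:
    looks_like_login = (
        bool(LOGIN_URL_HINTS.search(url))
        or "sign in" in title.lower()
        or "password" in visible_text.lower()
    )
    # Login-looking page, zero <input type="password"> elements:
    # the form is being rendered to canvas.
    return looks_like_login and password_input_count == 0

assert canvas_only_login("https://lure.example/f/abc", "Sign in",
                         "Password", 0)
assert not canvas_only_login("https://accounts.google.com/signin",
                             "Sign in", "Password", 1)
```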
noVNC library globals
noVNC's JavaScript library exports stable globals across versions: window.RFB, window.WebUtil, the #noVNC_canvas element, the .noVNC_container class. Script imports of rfb.js or websockify are present whenever a noVNC client is loaded; none of these ever appear on a real Google login page.
Polling these globals from a content script catches the BitM page even before the WebSocket handshake completes — the noVNC client loads its library in the page's JavaScript context before opening any sockets.
Origin mismatch with IdP visual cues
The page mimics Google's visual style — logo, color palette, font choices, layout — but location.hostname is something other than accounts.google.com. This is the pattern most lure pages use even before the noVNC stream loads, so it fires earlier in the page lifecycle.
False positives: legitimate co-branded Google sign-in (Workspace SSO with custom branding), some help-center articles that include screenshots of the Google login. Treat as a contributing signal in a weighted score, not as a primary detection on its own.
Cross-frame origin mismatch
The top-level URL is the lure domain. The "inner" content (or the canvas render target) presents content from accounts.google.com — but no actual frame to accounts.google.com is loaded. The Google page is pixels, not DOM. Useful as a confirmation signal: the page claims to be Google but document.querySelectorAll('iframe') returns zero frames pointing at any Google origin.
Behavioral signals
These are weaker on their own but useful as secondary confirmations or as user-facing indicators ("trust your senses, this feels off").
- Input latency. Round-trip to the remote container adds 50–300ms even on fast links. Compare keystroke→character-render latency against a baseline for the user's network. Anything consistently above ~80ms on a sign-in form is suspicious.
- Cursor lag. Remote cursor draw lags local mouse position. Sniffable client-side via mousemove event timestamps versus paint timestamps. Real Google pages render the cursor instantly because the cursor is just the browser's native cursor; noVNC streams render the cursor as part of the pixel feed and lag behind.
- Resolution / DPI mismatches. noVNC streams a fixed-resolution display. The device-pixel ratio of the rendered "browser" stays constant while the outer page reports the victim's actual DPR. Window resize triggers a noVNC re-render rather than a fluid CSS reflow.
- Audio channel anomalies. noVNC supports audio. Most legitimate login pages do not open audio. An audio WebSocket frame on a sign-in page is a strong tell.
These are useful for end-user awareness training — covered in the IR runbook — and for enriching a primary detection with corroborating evidence.
OAuth-side signals (post-compromise)
This is the highest-value section for SOC and IR teams. If you missed the live capture (most defenders will, because they do not have DPI on encrypted WebSockets), the OAuth grant is your second chance to catch the attack. It also catches the long tail of attacker access that follows — every time the attacker uses the refresh token, it leaves a Workspace audit log entry.
New OAuth grant immediately after a suspicious sign-in
The pattern: a user signs in (legitimate-looking auth event), and within seconds an OAuth grant is issued for a Desktop-app client requesting broad mail scopes. The attacker registered the OAuth client themselves; it has a generic name and no association with any of your tenant's known apps.
Workspace audit log query (Admin SDK Reports API):
```
applicationName = "token"
event_name = "authorize"
client_type = "Other"
scopes contains "mail.google.com"
```

Filter further on client_id not in your ApprovedAppIds watchlist. Fire on first occurrence per user per day. False-positive rate in a tuned tenant: low — most users authorize 0 to 1 new OAuth apps per quarter, so a fresh authorization is itself notable.
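A post-filter over the raw token-activity events might look like this sketch. The event parameter names (`client_id`, `client_type`, `scope`) follow the Reports API token-activity schema as we understand it; verify against your tenant's actual log shape before deploying:

```python
# Sketch: filter Workspace "token authorize" events pulled from the
# Admin SDK Reports API (reports_v1, applicationName="token").
# The allowlist entry is a placeholder, not a real client ID.
APPROVED_CLIENT_IDS = {"123-approved.apps.googleusercontent.com"}

def params(event: dict) -> dict:
    """Flatten the Reports API name/value parameter list into a dict."""
    out = {}
    for p in event.get("events", [{}])[0].get("parameters", []):
        out[p.get("name")] = p.get("value") or p.get("multiValue")
    return out

def is_suspicious_grant(event: dict) -> bool:
    p = params(event)
    scopes = p.get("scope") or []
    return (p.get("client_type") == "Other"          # Desktop-app client
            and any("mail.google.com" in s for s in scopes)
            and p.get("client_id") not in APPROVED_CLIENT_IDS)
```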
Grant scopes include mail.google.com for a non-mail app
The full-mailbox scope is mail.google.com. Almost no legitimate consumer-grade app needs this — most mail clients use the granular scopes (gmail.readonly, gmail.send, gmail.compose). A Desktop-app OAuth client requesting mail.google.com is the canonical attacker fingerprint.
Tune by maintaining an allowlist of legitimate apps that genuinely need the full scope (your IT-approved mail client, your CRM email integration). Anything else is suspicious.
Refresh-token issuance from a new ASN
The token issuance happens from the attacker's container, which lives on a hosting ASN — DigitalOcean, OVH, Hetzner, AWS, Azure, Cloudflare. A user whose normal sign-in geography is a residential ISP suddenly receiving a refresh token from a hosting ASN is the BitM signature.
Workspace sign-in logs include ip_address. Enrich with ASN lookup. Maintain a list of hosting ASNs and fire on any token-issuance event from one of them.
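The enrichment step reduces to a set lookup once you have the ASN. A minimal sketch, assuming the IP-to-ASN resolution happens upstream (e.g. a MaxMind or Team Cymru dataset); the ASNs listed are the well-known numbers for these providers:

```python
# Sketch: flag refresh-token issuance from a hosting ASN. The ip->ASN
# lookup itself is assumed to come from an upstream enrichment service.
HOSTING_ASNS = {
    14061: "DigitalOcean",
    16276: "OVH",
    24940: "Hetzner",
    16509: "AWS",
    8075:  "Microsoft/Azure",
    13335: "Cloudflare",
}

def token_issuance_alert(asn: int) -> bool:
    """True if the token-issuance source ASN is a hosting provider."""
    return asn in HOSTING_ASNS

assert token_issuance_alert(24940)       # Hetzner VPS: fire
assert not token_issuance_alert(7922)    # Comcast residential: normal
```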
Token-exchange flow without browser-history correlation
EDR or browser-history correlation: a refresh token gets issued to a user, but the user's browser history shows no visit to accounts.google.com during the same time window. The auth happened inside the attacker's browser, not the victim's, so the victim's browser has no history of it. Useful where you have EDR with browser-history visibility (CrowdStrike, SentinelOne, etc.).
Lab observations — what we documented from a live capture
These are direct findings from replicating the BitM technique against a controlled test account in our research lab on 2026-05-02. They add detail that desk research alone cannot.
The Gmail batch endpoint is a detection signal
The framework uses POST https://www.googleapis.com/batch/gmail/v1 to fetch up to 100 messages' metadata in a single request. Legitimate Gmail clients (the web app, mobile apps) do not use this endpoint in the same form — the web app uses internal gRPC/protobuf protocols, the mobile apps use IMAP/SMTP. Direct REST calls to the batch endpoint, from a Desktop-app OAuth client, with a python-requests-style user-agent, are the attacker's tooling.
Workspace audit-log signal:
- Actor: the Desktop-app OAuth client the attacker registered, identified by the `client_id` recorded at grant time
- Method: direct Gmail API REST calls via the batch endpoint
- Timing: batch calls arrive irregularly, triggered by operator UI interactions, rather than continuously as a real mail client syncs
- User-agent: `python-requests/X.X.X` or similar library UA, not a Google-app or browser UA
Fire on first batch-endpoint access per user per day from a non-allowlisted client. The threshold catches the attacker's first mailbox sweep, which is typically within minutes of grant issuance.
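The once-per-user-per-day dedup is the whole trick in that rule. A sketch with an in-memory set standing in for whatever state store your SIEM provides (names and the allowlist entry are illustrative):

```python
# Sketch: fire once per (user, client, day) on batch-endpoint access
# from a non-allowlisted client. In-memory set for illustration only.
from datetime import datetime

ALLOWLISTED_CLIENTS = {"approved-client-id"}   # placeholder entry
_seen: set = set()

def should_alert(user: str, client_id: str, ts: datetime) -> bool:
    if client_id in ALLOWLISTED_CLIENTS:
        return False
    key = (user, client_id, ts.date().isoformat())
    if key in _seen:
        return False       # already fired for this user/client today
    _seen.add(key)
    return True

t = datetime(2026, 5, 2, 10, 15)
assert should_alert("alice", "rogue-client", t)      # first sweep fires
assert not should_alert("alice", "rogue-client", t)  # same day: suppressed
```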
The 29-cookie capture and what each cookie does
In our E2E test, 29 cookies were captured from the authenticated Gmail session. The ones that matter for defenders:
- `SID`, `HSID`, `SSID` — primary session identifiers. Replay across IP triggers Google's impossible-travel and anomalous-session heuristics on enterprise accounts. On consumer accounts these heuristics are weaker and replay often succeeds.
- `__Secure-3PSID` — the SameSite-protected version, harder to replay across origins. Indicator of scope: if the attacker captured this one, they got the full cookie jar, not just the surface-level session ID.
- `OSID` — the OAuth session identifier. Tied to the OAuth grant created in the same session window. Pivot signal during IR — if you find an OSID associated with a grant you do not recognize, the grant is the attacker's.
- `NID` — preference and personalization. Normally stable across sessions; a sudden change indicates account access from a new browser environment.
IP-binding gap: Google's 2SV with push notifications does not bind session cookies to IP on all consumer account types. A captured cookie jar replayed from a fresh VPS IP succeeds without re-challenge. Workspace accounts with Conditional Access or BeyondCorp device trust block the replay — they are materially more resistant. If you are deciding between consumer Gmail and Workspace for sensitive accounts, this is a real differentiator.
Container infrastructure fingerprint (for IR)
The BitM Chrome containers as we built them leave a consistent process and network fingerprint. Useful for incident response if you get access to the attacker's VPS:
- Chrome launches with `--no-sandbox --disable-gpu --remote-debugging-port=9222`
- Container image typically carries a name like `bitm-victim:latest` on a Docker network like `bitm-net`
- supervisord manages the process tree: Xvfb + openbox + Chrome + socat
- CDP ports 9222/9223 (visible Chrome) and 9224/9225 (headless OAuth Chrome) listening on localhost; socat proxies to host-mapped ports in the 10001+ range
- Container user-agent: a single Chrome version baked into the image, identical across all victims captured by the same operator
Network-level visibility into the attacker's infrastructure is unlikely for most defenders. But these markers are forensically useful if law enforcement seizes the VPS — the process tree and port pattern confirm BitM deployment versus other phishing toolkits.
Putting it together — a weighted detection model
No single signal above is sufficient. Combine them into a weighted score, fire on the threshold:
- RFB handshake on a WebSocket: +100 (instant trigger)
- noVNC library globals on the page: +90 (near-instant trigger)
- New OAuth grant for `mail.google.com` from non-allowlisted client: +80
- Origin mismatch with Google visual cues: +55
- Canvas-only login (password requested, no `<input type=password>` in DOM): +50
- WebSocket to non-IdP host during a login context: +40
- BitM-style lure path (`/f/<slug>` on freshly-registered single-VPS domain): +20
- Behavioral: input lag >80ms, cursor lag, DPI mismatch: +10 each
Thresholds: 40 for amber/CAUTION, 80+ for red/DANGER (block or alert immediately). This is the same scoring model BitM Shield uses (see the mitigations post for the extension architecture).
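The scoring model above reduces to a weight table and two thresholds. A minimal sketch, with the weights and cut-offs taken directly from the table (the signal-name keys are illustrative):

```python
# Sketch: the weighted BitM detection score with the thresholds above.
WEIGHTS = {
    "rfb_handshake": 100,        # instant trigger
    "novnc_globals": 90,         # near-instant trigger
    "oauth_full_mail_grant": 80,
    "origin_mismatch": 55,
    "canvas_only_login": 50,
    "ws_to_non_idp": 40,
    "bitm_lure_path": 20,
    "input_lag": 10, "cursor_lag": 10, "dpi_mismatch": 10,
}

def verdict(signals: set) -> str:
    score = sum(WEIGHTS.get(s, 0) for s in signals)
    if score >= 80:
        return "DANGER"    # block or alert immediately
    if score >= 40:
        return "CAUTION"
    return "OK"

assert verdict({"rfb_handshake"}) == "DANGER"
assert verdict({"ws_to_non_idp", "input_lag"}) == "CAUTION"   # 50
assert verdict({"input_lag"}) == "OK"
```

Note that the two highest-fidelity signals each clear the DANGER threshold on their own, matching the "instant trigger" labels in the table; the weaker signals only fire in combination.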
Read the mitigations post next — what to deploy to prevent the capture in the first place, including the BitM Shield extension we built and verified in our research lab.