System Architecture and Security

BitDive captures method-level execution data from running Java applications. This data includes method names, arguments, return values, SQL queries, and HTTP payloads. Because this data can contain sensitive information, the entire architecture is built around a Zero-Trust security model: every component is authenticated, every transfer is encrypted, and sensitive fields are masked before they leave your infrastructure.


Zero-Trust Principles

Zero-Trust means no component in the system is implicitly trusted. Even internal services authenticate to each other.

| Principle | How BitDive Implements It |
| --- | --- |
| Never trust, always verify | Every service-to-service call requires a valid, short-lived TLS certificate managed by Vault. No shared secrets. |
| Least privilege | The Java agent has read-only access to your application's execution flow. It cannot modify code, alter state, or access the file system beyond its own temp directory. |
| Encrypt everything | TLS 1.3 in transit. AES-256 at rest. No plaintext data at any stage. |
| Mask at the source | PII fields are redacted inside the agent's memory before any data leaves your JVM. The server never sees the original values. |
| Isolate tenants | Each organization's data is logically isolated. Cross-tenant access is architecturally impossible. |
| Rotate credentials | SSL certificates and encryption keys rotate on a 24-hour cycle via Vault. |

What the Java Agent Collects (and What It Does Not)

Collected

  • Method metadata: Class name, method name, execution time, call sequence.
  • Method arguments and return values: The objects passed in and returned, serialized to JSON. Subject to PII masking.
  • SQL queries: Normalized query text and bind parameters.
  • HTTP requests/responses: URL, method, headers, body (subject to masking).
  • Kafka messages: Topic, key, payload (subject to masking).
  • JVM metrics: Heap usage, GC activity, thread counts.
  • Exception stack traces: Full stack trace with cause chain.

NOT Collected

  • Source code. The agent instruments bytecode at runtime but never reads or transmits .java files.
  • File system contents. No disk access beyond the agent's own temp directory.
  • Environment variables or system properties (unless explicitly configured).
  • Network traffic outside the instrumented application. The agent does not act as a proxy or packet sniffer.
  • Plaintext credentials, tokens, or secrets. Matching fields are masked by default via the Sensitive Keywords configuration.

Controlling What Gets Captured

You have fine-grained control over instrumentation scope:

  • Package filters: Instrument only your packages (e.g., com.yourcompany.*), excluding frameworks and libraries.
  • Method depth limits: Cap how deep into the call stack the agent records.
  • Sensitive Keywords: Any field matching keywords like password, token, cookie, secret, ssn, creditCard is automatically masked to *** before serialization. Configure in Advanced Settings.
  • Selective capture: Disable argument/return value capture for specific methods or classes.
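
To make the first two controls concrete, here is a minimal sketch of how a package filter and a depth limit could be combined into a single "should this call be recorded?" decision. The class `InstrumentationScope`, its method names, and the wildcard semantics (a trailing `.*` covers the package and its subpackages) are assumptions made for illustration; they are not BitDive's configuration API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an instrumentation-scope check; not BitDive's actual API. */
final class InstrumentationScope {
    private final List<String> packagePatterns; // e.g. "com.yourcompany.*"
    private final int maxDepth;                 // deepest call-stack level to record

    InstrumentationScope(List<String> packagePatterns, int maxDepth) {
        this.packagePatterns = new ArrayList<>(packagePatterns);
        this.maxDepth = maxDepth;
    }

    /** True when the class is in an included package and the call is shallow enough. */
    boolean shouldRecord(String className, int depth) {
        if (depth > maxDepth) {
            return false;
        }
        for (String pattern : packagePatterns) {
            if (matches(pattern, className)) {
                return true;
            }
        }
        return false;
    }

    /** Assumed semantics: "com.foo.*" covers com.foo and its subpackages; otherwise exact match. */
    private static boolean matches(String pattern, String className) {
        if (pattern.endsWith(".*")) {
            // Drop only the "*" so the trailing dot stays and "com.foobar.X" is not matched.
            String prefix = pattern.substring(0, pattern.length() - 1);
            return className.startsWith(prefix);
        }
        return pattern.equals(className);
    }

    public static void main(String[] args) {
        InstrumentationScope scope =
                new InstrumentationScope(Arrays.asList("com.yourcompany.*"), 10);
        System.out.println(scope.shouldRecord("com.yourcompany.orders.OrderService", 3));
        System.out.println(scope.shouldRecord("org.springframework.web.servlet.DispatcherServlet", 3));
        System.out.println(scope.shouldRecord("com.yourcompany.orders.OrderService", 11));
    }
}
```

Checking the depth first keeps the common "too deep" case cheap, since it avoids any string comparison.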

Architecture Overview

The system consists of multiple microservices, each with a single responsibility:

Architecture Diagram

[Figure: BitDive System Architecture Diagram. Core services, security components, and infrastructure interaction.]

Core Services

| Service | Responsibility | Security Boundary |
| --- | --- | --- |
| Java Agent | Instruments the target app, captures traces, masks PII, encrypts and signs payloads before transmission | Runs inside your JVM. Only outbound connection is to File Acceptor via TLS. |
| File Acceptor | Receives encrypted payloads, verifies digital signatures, stores in MinIO | Public-facing endpoint. Rejects any payload with an invalid signature. |
| Flink Load | Decrypts payloads, parses trace data, writes to PostgreSQL | Internal only. Retrieves decryption keys from Vault on demand. |
| Monitoring API | Serves the dashboard UI, exposes MCP endpoints for AI agents | Authenticated via Keycloak (OIDC/SAML). All queries go through RBAC. |
| Frontend | React dashboard for trace exploration, test creation, configuration | Communicates only with Monitoring API over TLS. |

Infrastructure Services

| Service | Role |
| --- | --- |
| Vault | Manages all secrets: SSL certificates, encryption keys, signing keys. 24-hour automatic rotation. |
| PostgreSQL | Primary data store. Encrypted at rest (AES-256-CBC). SSL-only connections. |
| MinIO | Object storage for encrypted trace archives. Server-side encryption enabled. |
| Keycloak | Identity provider. Supports OIDC, SAML 2.0, SSO, MFA. Manages RBAC policies. |

Encryption: In Transit and At Rest

In Transit

All communication uses TLS 1.3 with Perfect Forward Secrecy:

  • Agent → File Acceptor: Encrypted payload over TLS. Additionally, the payload itself is encrypted with AES/GCM/NoPadding and signed with SHA256withECDSA before transmission (double protection).
  • Service → Service: Mutual TLS (mTLS) between all internal microservices. Certificates issued by Vault.
  • Dashboard → API: Standard TLS 1.3. Authentication tokens issued by Keycloak.
  • MCP (AI Agent) → API: TLS 1.3 with API key authentication.

At Rest

  • MinIO: Trace archives stored with AES-256 server-side encryption.
  • PostgreSQL: Data encrypted with AES-256-CBC. SSL-only connections enforced.
  • Vault: Secrets sealed with Shamir's Secret Sharing. Auto-unseal via cloud KMS or manual keys.

Key Rotation

All encryption keys and SSL certificates rotate on a 24-hour cycle:

  1. Vault issues new certificates and keys automatically.
  2. Services pick up new credentials on the next heartbeat (< 5 minutes).
  3. Old keys are retained temporarily for decrypting in-flight data, then destroyed.

PII Masking and Data Privacy

BitDive masks sensitive data at the source — inside the Java agent, before any data leaves your application's JVM.

How It Works

  1. Configuration: You define Sensitive Keywords in the Configuration panel (e.g., password, token, cookie, secret, ssn, creditCard).
  2. Agent-side masking: When the agent serializes method arguments or return values, any field name matching a keyword is replaced with ***.
  3. Server never sees originals: The masked data is what gets encrypted and transmitted. Even if someone decrypted the payload, the sensitive values are already gone.
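
The masking step can be pictured with a small sketch. It assumes the captured object has already been turned into nested maps and lists, and that a field "matches" when its name contains a keyword, ignoring case. The source does not say whether BitDive matches exactly or by substring, so treat that rule, and the class name `SensitiveFieldMasker`, as hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of keyword-based masking; not BitDive's actual implementation. */
final class SensitiveFieldMasker {
    static final String MASK = "***";
    private final List<String> keywords = new ArrayList<>();

    SensitiveFieldMasker(List<String> keywords) {
        for (String keyword : keywords) {
            this.keywords.add(keyword.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns a masked copy; the input map is left untouched. */
    Map<String, Object> mask(Map<String, ?> fields) {
        Map<String, Object> masked = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : fields.entrySet()) {
            String name = entry.getKey();
            masked.put(name, isSensitive(name) ? MASK : maskValue(entry.getValue()));
        }
        return masked;
    }

    /** Recurses into nested maps and lists so sensitive fields are caught at any depth. */
    private Object maskValue(Object value) {
        if (value instanceof Map) {
            Map<String, Object> nested = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                nested.put(String.valueOf(entry.getKey()), entry.getValue());
            }
            return mask(nested);
        }
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) value) {
                copy.add(maskValue(item));
            }
            return copy;
        }
        return value;
    }

    /** Assumed rule: case-insensitive substring match on the field name. */
    private boolean isSensitive(String fieldName) {
        String lower = fieldName.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (lower.contains(keyword)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SensitiveFieldMasker masker =
                new SensitiveFieldMasker(Arrays.asList("password", "token", "secret"));
        Map<String, Object> login = new LinkedHashMap<>();
        login.put("username", "alice");
        login.put("password", "hunter2");
        System.out.println(masker.mask(login));
    }
}
```

Because masking produces a copy, the application's own objects are never modified, which is in line with the read-only role described for the agent.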

Default Masked Fields

Out of the box, BitDive masks fields matching: password, token, secret, cookie, authorization, credential, apiKey.

GDPR Compliance

  • BitDive acts as a Data Processor under your instructions (Data Controller).
  • Data Subject rights (access, rectification, erasure) are supported via the dashboard.
  • A signed Data Processing Agreement (DPA) is available on request.
  • See the full Privacy Policy for details.

Deployment Models

BitDive supports three deployment models, each with different security characteristics:

SaaS (Cloud-Hosted)

  • Infrastructure: Hosted on major cloud providers (EU and UK regions).
  • Data residency: EU by default. Custom regions available on Enterprise plans.
  • Network: Agent connects outbound to api.bitdive.io over TLS 1.3 (port 443). No inbound connections required.
  • Best for: Teams that want zero infrastructure overhead.

On-Premise

  • Infrastructure: You run the full BitDive stack (Docker Compose) inside your network.
  • Data residency: All data stays within your infrastructure. Nothing leaves your network.
  • Network: Agent connects to your local BitDive instance. No external calls.
  • Best for: Organizations with data sovereignty requirements.

Air-Gapped

  • Infrastructure: Fully isolated deployment with no internet access.
  • License: Offline license validation. No phone-home telemetry.
  • Network: Zero external connections. Agent, backend, and dashboard all run within the isolated network.
  • Best for: Defense, finance, healthcare, and regulated industries.

Secure File Transfer Process

Process Diagram

[Figure: Secure File Transfer Workflow. Java agent archive preparation and encrypted transmission to BitDive backend.]

Steps

  1. Agent prepares archive: Captured trace data is serialized using BitDive's high-performance binary format (not JSON — smaller, faster, no accidental PII in string form).
  2. Encryption: The archive is encrypted with AES/GCM/NoPadding using a per-session key.
  3. Signing: The encrypted archive is signed with SHA256withECDSA using the agent's private key.
  4. Transmission: The signed, encrypted payload is sent to File Acceptor over TLS 1.3.
  5. Verification: File Acceptor verifies the signature against the agent's registered public key. Invalid signatures are rejected immediately.
  6. Storage: The verified payload is stored in MinIO with server-side encryption. The original archive is never stored in plaintext.
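
Steps 2, 3, and 5 map directly onto the standard Java security APIs, since AES/GCM/NoPadding and SHA256withECDSA are both JCA algorithm names. The sketch below shows one plausible shape for them. Everything beyond those two algorithm names is an assumption for illustration: the class name, the payload layout (IV prepended to the ciphertext), the secp256r1 curve, and generating keys locally rather than obtaining them from the backend.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;
import java.security.SignatureException;
import java.security.spec.ECGenParameterSpec;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Illustrative only: names and payload layout are hypothetical, not BitDive's API. */
final class SecureArchiveSketch {
    private static final int IV_BYTES = 12;  // 96-bit nonce, the usual size for GCM
    private static final int TAG_BITS = 128; // authentication tag length

    /** Step 2: encrypt the archive. Assumed output layout: IV, then ciphertext plus tag. */
    static byte[] encrypt(SecretKey key, byte[] archive) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv); // fresh IV per archive; never reuse with one key
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(archive);
            byte[] payload = Arrays.copyOf(iv, IV_BYTES + ciphertext.length);
            System.arraycopy(ciphertext, 0, payload, IV_BYTES, ciphertext.length);
            return payload;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Step 3: sign the encrypted payload with the agent's private key. */
    static byte[] sign(PrivateKey privateKey, byte[] payload) {
        try {
            Signature signer = Signature.getInstance("SHA256withECDSA");
            signer.initSign(privateKey);
            signer.update(payload);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    /** Step 5: receiver-side check against the agent's registered public key. */
    static boolean verify(PublicKey publicKey, byte[] payload, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withECDSA");
            verifier.initVerify(publicKey);
            verifier.update(payload);
            return verifier.verify(signature);
        } catch (SignatureException e) {
            return false; // malformed signature bytes count as invalid
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("verification failed", e);
        }
    }

    /** Stand-in for the per-session key; generated locally for this sketch. */
    static SecretKey newSessionKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for the agent key pair; the curve is an assumption. */
    static KeyPair newAgentKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
            generator.initialize(new ECGenParameterSpec("secp256r1"));
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] archive = "example trace archive".getBytes(StandardCharsets.UTF_8);
        KeyPair agentKeys = newAgentKeyPair();
        byte[] payload = encrypt(newSessionKey(), archive);
        byte[] signature = sign(agentKeys.getPrivate(), payload);
        System.out.println("signature valid: " + verify(agentKeys.getPublic(), payload, signature));
    }
}
```

Signing the ciphertext rather than the plaintext means the receiver can reject a tampered payload without holding any decryption key, which matches File Acceptor's role in step 5.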

Key Management

  • The agent retrieves its public/private key pair from the BitDive backend on startup.
  • Keys rotate every 24 hours. The agent caches the current key locally for high-throughput scenarios.
  • Old keys are valid for 48 hours (grace period for in-flight data), then invalidated.
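
The rotation and grace-period timings can be expressed as a small state check, which is also handy when diagnosing expired-key errors. The text above does not say when the 48-hour grace period starts; this sketch assumes it starts when the key is rotated out, 24 hours after issuance. `KeyLifecycle` and its members are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper; names are illustrative and not part of BitDive's API. */
final class KeyLifecycle {
    static final Duration ROTATION_INTERVAL = Duration.ofHours(24); // rotation cycle from the text
    static final Duration GRACE_PERIOD = Duration.ofHours(48);      // grace period from the text

    enum State { CURRENT, GRACE, INVALID }

    /** Assumption: the grace period is counted from the moment the key is rotated out. */
    static State stateOf(Instant issuedAt, Instant now) {
        Instant supersededAt = issuedAt.plus(ROTATION_INTERVAL);
        if (now.isBefore(supersededAt)) {
            return State.CURRENT;
        }
        if (now.isBefore(supersededAt.plus(GRACE_PERIOD))) {
            return State.GRACE; // still accepted, so in-flight data can be processed
        }
        return State.INVALID;
    }

    public static void main(String[] args) {
        Instant issuedAt = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(stateOf(issuedAt, issuedAt.plus(Duration.ofHours(1))));
        System.out.println(stateOf(issuedAt, issuedAt.plus(Duration.ofHours(30))));
        System.out.println(stateOf(issuedAt, issuedAt.plus(Duration.ofHours(80))));
    }
}
```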

File Processing Pipeline

Process Diagram

[Figure: BitDive File Processing Pipeline. Decryption, analysis, and secure storage of Java execution data.]

Flow

  1. Retrieval: Flink Load pulls encrypted archives from MinIO.
  2. Decryption: Decryption keys are fetched from Vault on demand (never cached to disk).
  3. Parsing: The binary archive is parsed into structured trace data: methods, SQL, HTTP, Kafka.
  4. Storage: Parsed data is written to PostgreSQL over an SSL connection with AES-CBC encryption at the column level for sensitive fields.
  5. Cleanup: The encrypted archive in MinIO is retained per the retention policy, then automatically purged.

Access Control and Authentication

Keycloak Integration

BitDive uses Keycloak for identity management:

  • Protocols: OIDC and SAML 2.0.
  • SSO: Connect your corporate identity provider (Okta, Azure AD, Google Workspace).
  • MFA: Multi-factor authentication supported for all plans.
  • RBAC: Role-based access control with predefined roles (Admin, Developer, Viewer) and custom roles on Enterprise plans.

API and MCP Authentication

  • Dashboard API: Bearer tokens issued by Keycloak. Short-lived (15 minutes), automatically refreshed.
  • MCP endpoints: API key authentication with per-key scope restrictions.
  • Audit trail: Every API call is logged with user, timestamp, action, and affected resource (Enterprise plans).

Frontend Security

Process Diagram

[Figure: BitDive Frontend Operations. Interaction between the UI dashboard, monitoring API, and security layers.]

  • All dashboard communication is over TLS.
  • Authentication tokens are stored in httpOnly cookies (not localStorage).
  • CSP headers prevent XSS and injection attacks.
  • For on-premise deployments, distribute the SSL certificate to browsers to avoid trust warnings.

Client Library Integration

The BitDive agent is a standard Maven/Gradle dependency:

<dependency>
    <groupId>io.bitdive</groupId>
    <artifactId>bitdive-producer-spring-3</artifactId>
    <version>0.0.15</version>
</dependency>

  • No production code changes required. The agent uses the Java Instrumentation API (bytecode-level, not source-level).
  • Performance overhead: 0.5–5% CPU depending on load and instrumentation depth.
  • Kill switch: The agent can be disabled instantly via the dashboard without redeploying your application.

Data Retention and Deletion

| Data Type | Default Retention | Enterprise |
| --- | --- | --- |
| Detailed execution traces | 14 days (Starter) / 90 days (Pro) | Custom policy |
| Aggregated performance metrics | 1 year | Custom policy |
| Encrypted archives (MinIO) | Same as trace retention | Custom policy |
| Audit logs | 90 days (Pro) | SIEM export, custom retention |

  • Deletion is automatic at the end of the retention window.
  • Manual deletion is available via the dashboard (Data Subject erasure requests).
  • On account deletion, all data is purged within 30 days.
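
As a worked example of the retention arithmetic, the sketch below computes when a trace becomes eligible for automatic purge. The 14-day window comes from the Starter row above. The class `RetentionPolicy`, and the choice to treat the exact end of the window as already expired, are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of a retention window check; not BitDive's actual purge logic. */
final class RetentionPolicy {
    private final Duration window;

    RetentionPolicy(int retentionDays) {
        this.window = Duration.ofDays(retentionDays);
    }

    /** The instant at which data captured at capturedAt leaves the retention window. */
    Instant purgeAt(Instant capturedAt) {
        return capturedAt.plus(window);
    }

    /** Assumption: data is purge-eligible from the exact end of the window onward. */
    boolean isExpired(Instant capturedAt, Instant now) {
        return !now.isBefore(purgeAt(capturedAt));
    }

    public static void main(String[] args) {
        RetentionPolicy starter = new RetentionPolicy(14); // Starter plan trace retention
        Instant capturedAt = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println("purge at: " + starter.purgeAt(capturedAt));
        System.out.println("expired after 20 days: "
                + starter.isExpired(capturedAt, capturedAt.plus(Duration.ofDays(20))));
    }
}
```

The same check is a quick way to rule out retention as the cause when older traces are missing from the dashboard.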

System Requirements

| Component | Requirement |
| --- | --- |
| Java Agent | JDK 8+ (supports 8, 11, 17, 21) |
| Docker (on-premise) | Docker 20.10+ and Docker Compose v2 |
| PostgreSQL | 14+ (provided in Docker Compose) |
| MinIO | Latest stable (provided in Docker Compose) |
| Network | Outbound HTTPS (port 443) to api.bitdive.io for SaaS. No inbound ports required. |
| Storage | 50 GB minimum for MinIO (scales with trace volume) |

Data Privacy Capabilities

The architecture provides the building blocks that organizations need to meet their compliance requirements:

| Capability | What It Gives You |
| --- | --- |
| PII masking at source | Sensitive fields are redacted inside the JVM before data leaves your infrastructure. The server never sees originals. |
| On-premise / air-gapped deployment | All data stays within your network. Zero external connections. |
| Encryption everywhere | TLS 1.3 in transit, AES-256 at rest, 24-hour key rotation via Vault. |
| RBAC and SSO | Keycloak-based access control with OIDC/SAML. Custom roles on Enterprise plans. |
| Data retention controls | Configurable retention windows with automatic purge. Manual deletion via dashboard. |
| GDPR data subject rights | Access, rectification, and erasure supported via the dashboard. DPA available on request. |
| Audit logging | Every API action logged with user, timestamp, and affected resource (Enterprise plans). |

For privacy questions or to request a Data Processing Agreement, contact privacy@bitdive.io.


Troubleshooting

Browser Trust Issues (On-Premise)

  • Distribute the BitDive SSL certificate to all client machines.
  • Verify certificate validity with openssl s_client -connect your-bitdive:443.
  • Restart the browser after installing the certificate.

Agent Connection Issues

  • Verify outbound HTTPS access to api.bitdive.io (SaaS) or your local BitDive instance.
  • Check that the agent key has not expired (24-hour rotation).
  • Review agent logs for "SSL handshake failed" or "signature verification failed" errors.

Data Not Appearing

  • Confirm the agent is attached: look for "BitDive Agent initialized" in the application startup logs.
  • Check that your package filters include the classes you expect to instrument.
  • Verify that the retention policy has not already purged older data.