System Architecture and Security
BitDive captures method-level execution data from running Java applications. This data includes method names, arguments, return values, SQL queries, and HTTP payloads. Because this data can contain sensitive information, the entire architecture is built around a Zero-Trust security model: every component is authenticated, every transfer is encrypted, and sensitive fields are masked before they leave your infrastructure.
Zero-Trust Principles
Zero-Trust means no component in the system is implicitly trusted. Even internal services authenticate to each other.
| Principle | How BitDive Implements It |
|---|---|
| Never trust, always verify | Every service-to-service call requires a valid, short-lived TLS certificate managed by Vault. No shared secrets. |
| Least privilege | The Java agent has read-only access to your application's execution flow. It cannot modify code, alter state, or access the file system. |
| Encrypt everything | TLS 1.3 in transit. AES-256 at rest. No plaintext data at any stage. |
| Mask at the source | PII fields are redacted inside the agent's memory before any data leaves your JVM. The server never sees the original values. |
| Isolate tenants | Each organization's data is logically isolated. Cross-tenant access is architecturally impossible. |
| Rotate credentials | SSL certificates and encryption keys rotate on a 24-hour cycle via Vault. |
What the Java Agent Collects (and What It Does Not)
Collected
- Method metadata: Class name, method name, execution time, call sequence.
- Method arguments and return values: The objects passed in and returned, serialized to JSON. Subject to PII masking.
- SQL queries: Normalized query text and bind parameters.
- HTTP requests/responses: URL, method, headers, body (subject to masking).
- Kafka messages: Topic, key, payload (subject to masking).
- JVM metrics: Heap usage, GC activity, thread counts.
- Exception stack traces: Full stack trace with cause chain.
NOT Collected
- Source code. The agent instruments bytecode at runtime but never reads or transmits `.java` files.
- File system contents. No disk access beyond the agent's own temp directory.
- Environment variables or system properties (unless explicitly configured).
- Network traffic outside the instrumented application. The agent does not act as a proxy or packet sniffer.
- Credentials, tokens, or secrets in cleartext. Fields matching the Sensitive Keywords configuration are masked by default.
Controlling What Gets Captured
You have fine-grained control over instrumentation scope:
- Package filters: Instrument only your packages (e.g., `com.yourcompany.*`), excluding frameworks and libraries.
- Method depth limits: Cap how deep into the call stack the agent records.
- Sensitive Keywords: Any field matching keywords like `password`, `token`, `cookie`, `secret`, `ssn`, `creditCard` is automatically masked to `***` before serialization. Configure in Advanced Settings.
- Selective capture: Disable argument/return value capture for specific methods or classes.
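As a rough illustration of how a package filter and a depth cap could combine, here is a minimal sketch. The class name, method names, and the matching rule (a trailing `.*` covers the package and its subpackages) are assumptions made for this example, not BitDive's actual configuration semantics.

```java
import java.util.List;

// Illustrative sketch only; names are hypothetical, not BitDive's API.
final class InstrumentationScopeSketch {

    // Assumption: a trailing ".*" matches the package and all of its subpackages;
    // any other pattern must equal the fully qualified class name.
    static boolean matchesPackageFilter(String className, String pattern) {
        if (pattern.endsWith(".*")) {
            // Drop only the "*", keeping the dot: "com.yourcompany."
            String prefix = pattern.substring(0, pattern.length() - 1);
            return className.startsWith(prefix);
        }
        return className.equals(pattern);
    }

    // A call is recorded only if it is within the depth cap and matches a filter.
    static boolean shouldRecord(String className, int callDepth,
                                List<String> packageFilters, int maxDepth) {
        if (callDepth > maxDepth) {
            return false;
        }
        for (String pattern : packageFilters) {
            if (matchesPackageFilter(className, pattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> filters = List.of("com.yourcompany.*");
        System.out.println(shouldRecord("com.yourcompany.orders.OrderService", 3, filters, 10));
        System.out.println(shouldRecord("org.springframework.web.Dispatcher", 3, filters, 10));
        System.out.println(shouldRecord("com.yourcompany.orders.OrderService", 11, filters, 10));
    }
}
```

With these inputs the sketch prints `true`, `false`, `false`: the framework class is excluded by the filter, and the third call is excluded by the depth cap.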
Architecture Overview
The system consists of multiple microservices, each with a single responsibility:
Architecture Diagram

Core Services
| Service | Responsibility | Security Boundary |
|---|---|---|
| Java Agent | Instruments the target app, captures traces, masks PII, encrypts and signs payloads before transmission | Runs inside your JVM. Only outbound connection is to File Acceptor via TLS. |
| File Acceptor | Receives encrypted payloads, verifies digital signatures, stores in MinIO | Public-facing endpoint. Rejects any payload with an invalid signature. |
| Flink Load | Decrypts payloads, parses trace data, writes to PostgreSQL | Internal only. Retrieves decryption keys from Vault on demand. |
| Monitoring API | Serves the dashboard UI, exposes MCP endpoints for AI agents | Authenticated via Keycloak (OIDC/SAML). All queries go through RBAC. |
| Frontend | React dashboard for trace exploration, test creation, configuration | Communicates only with Monitoring API over TLS. |
Infrastructure Services
| Service | Role |
|---|---|
| Vault | Manages all secrets: SSL certificates, encryption keys, signing keys. 24-hour automatic rotation. |
| PostgreSQL | Primary data store. Encrypted at rest (AES-256-CBC). SSL-only connections. |
| MinIO | Object storage for encrypted trace archives. Server-side encryption enabled. |
| Keycloak | Identity provider. Supports OIDC, SAML 2.0, SSO, MFA. Manages RBAC policies. |
Encryption: In Transit and At Rest
In Transit
All communication uses TLS 1.3 with Perfect Forward Secrecy:
- Agent → File Acceptor: Encrypted payload over TLS. Additionally, the payload itself is encrypted with AES/GCM/NoPadding and signed with SHA256withECDSA before transmission (double protection).
- Service → Service: Mutual TLS (mTLS) between all internal microservices. Certificates issued by Vault.
- Dashboard → API: Standard TLS 1.3. Authentication tokens issued by Keycloak.
- MCP (AI Agent) → API: TLS 1.3 with API key authentication.
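For readers who want to see what "TLS 1.3 only" looks like from a Java client, the following sketch restricts the JDK's built-in HTTP client to that protocol version. It is a generic JDK 11+ illustration, not the agent's own transport code; the class and method names are hypothetical.

```java
import java.net.http.HttpClient;
import javax.net.ssl.SSLParameters;

// Illustrative sketch only: a generic JDK client, not BitDive's transport layer.
final class Tls13ClientSketch {

    // Allow TLS 1.3 and nothing older; a handshake with an older server will fail.
    static SSLParameters tls13Only() {
        SSLParameters params = new SSLParameters();
        params.setProtocols(new String[] {"TLSv1.3"});
        return params;
    }

    // Builds a client that applies the restriction to every connection it opens.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .sslParameters(tls13Only())
                .build();
    }

    public static void main(String[] args) {
        HttpClient client = newClient();
        System.out.println(String.join(",", client.sslParameters().getProtocols()));
    }
}
```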
At Rest
- MinIO: Trace archives stored with AES-256 server-side encryption.
- PostgreSQL: Data encrypted with AES-256-CBC. SSL-only connections enforced.
- Vault: Secrets sealed with Shamir's Secret Sharing. Auto-unseal via cloud KMS or manual keys.
Key Rotation
All encryption keys and SSL certificates rotate on a 24-hour cycle:
- Vault issues new certificates and keys automatically.
- Services pick up new credentials on the next heartbeat (< 5 minutes).
- Old keys are retained temporarily for decrypting in-flight data, then destroyed.
PII Masking and Data Privacy
BitDive masks sensitive data at the source — inside the Java agent, before any data leaves your application's JVM.
How It Works
- Configuration: You define Sensitive Keywords in the Configuration panel (e.g., `password`, `token`, `cookie`, `secret`, `ssn`, `creditCard`).
- Agent-side masking: When the agent serializes method arguments or return values, any field name matching a keyword is replaced with `***`.
- Server never sees originals: The masked data is what gets encrypted and transmitted. Even if someone decrypted the payload, the sensitive values are already gone.
Default Masked Fields
Out of the box, BitDive masks fields matching: password, token, secret, cookie, authorization, credential, apiKey.
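The masking step can be pictured with a small sketch like the one below. It assumes a case-insensitive substring match on field names and a map-based representation of the captured values; both are assumptions for illustration, and the class and method names are hypothetical rather than BitDive's real implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only: not BitDive's actual masking code.
final class KeywordMaskingSketch {

    // Default keywords as listed in this section.
    static final List<String> DEFAULT_KEYWORDS = List.of(
            "password", "token", "secret", "cookie", "authorization", "credential", "apiKey");

    // Assumption: a field is sensitive if its name contains a keyword, ignoring case.
    static boolean isSensitive(String fieldName, List<String> keywords) {
        String lower = fieldName.toLowerCase(Locale.ROOT);
        for (String keyword : keywords) {
            if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Returns a masked copy; the input map is left untouched. Nested maps are masked recursively.
    static Map<String, Object> mask(Map<String, Object> fields, List<String> keywords) {
        Map<String, Object> masked = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : fields.entrySet()) {
            Object value = entry.getValue();
            if (isSensitive(entry.getKey(), keywords)) {
                masked.put(entry.getKey(), "***");
            } else if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) value;
                masked.put(entry.getKey(), mask(nested, keywords));
            } else {
                masked.put(entry.getKey(), value);
            }
        }
        return masked;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new LinkedHashMap<>();
        session.put("authToken", "abc123");
        session.put("locale", "en");

        Map<String, Object> call = new LinkedHashMap<>();
        call.put("username", "alice");
        call.put("userPassword", "hunter2");
        call.put("session", session);

        // Prints {username=alice, userPassword=***, session={authToken=***, locale=en}}
        System.out.println(mask(call, DEFAULT_KEYWORDS));
    }
}
```

Because masking produces a copy before serialization, the original objects in the running application are never modified, which is consistent with the read-only posture described for the agent.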
GDPR Compliance
- BitDive acts as a Data Processor under your instructions (Data Controller).
- Data Subject rights (access, rectification, erasure) are supported via the dashboard.
- A signed Data Processing Agreement (DPA) is available on request.
- See the full Privacy Policy for details.
Deployment Models
BitDive supports three deployment models, each with different security characteristics:
SaaS (Cloud-Hosted)
- Infrastructure: Hosted on major cloud providers (EU and UK regions).
- Data residency: EU by default. Custom regions available on Enterprise plans.
- Network: Agent connects outbound to `api.bitdive.io` over TLS 1.3 (port 443). No inbound connections required.
- Best for: Teams that want zero infrastructure overhead.
On-Premise
- Infrastructure: You run the full BitDive stack (Docker Compose) inside your network.
- Data residency: All data stays within your infrastructure. Nothing leaves your network.
- Network: Agent connects to your local BitDive instance. No external calls.
- Best for: Organizations with data sovereignty requirements.
Air-Gapped
- Infrastructure: Fully isolated deployment with no internet access.
- License: Offline license validation. No phone-home telemetry.
- Network: Zero external connections. Agent, backend, and dashboard all run within the isolated network.
- Best for: Defense, finance, healthcare, and regulated industries.
Secure File Transfer Process
Process Diagram

Steps
- Agent prepares archive: Captured trace data is serialized using BitDive's high-performance binary format (not JSON — smaller, faster, no accidental PII in string form).
- Encryption: The archive is encrypted with AES/GCM/NoPadding using a per-session key.
- Signing: The encrypted archive is signed with SHA256withECDSA using the agent's private key.
- Transmission: The signed, encrypted payload is sent to File Acceptor over TLS 1.3.
- Verification: File Acceptor verifies the signature against the agent's registered public key. Invalid signatures are rejected immediately.
- Storage: The verified payload is stored in MinIO with server-side encryption. The original archive is never stored in plaintext.
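The encrypt, sign, and verify steps above map directly onto standard JCA algorithm names. The sketch below shows one way they could fit together. The payload layout (a 12-byte IV prepended to the ciphertext), the `secp256r1` curve, the 256-bit key size, and all class and method names are assumptions for illustration; BitDive's real wire format is not described here.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import java.security.Signature;
import java.security.spec.ECGenParameterSpec;

// Illustrative sketch only: layout, curve, and names are assumptions.
final class EncryptThenSignSketch {
    static final int IV_BYTES = 12;   // recommended IV size for GCM
    static final int TAG_BITS = 128;  // GCM authentication tag length

    static SecretKey newSessionKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    static KeyPair newAgentKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("EC");
        generator.initialize(new ECGenParameterSpec("secp256r1"));
        return generator.generateKeyPair();
    }

    // Encrypts with a fresh random IV and returns IV || ciphertext-with-tag.
    static byte[] encrypt(byte[] plaintext, SecretKey key) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] payload = new byte[iv.length + ciphertext.length];
        System.arraycopy(iv, 0, payload, 0, iv.length);
        System.arraycopy(ciphertext, 0, payload, iv.length, ciphertext.length);
        return payload;
    }

    // Agent side: sign the already-encrypted payload.
    static byte[] sign(byte[] payload, PrivateKey privateKey) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA256withECDSA");
        signer.initSign(privateKey);
        signer.update(payload);
        return signer.sign();
    }

    // Receiver side: check the signature before storing anything.
    static boolean verify(byte[] payload, byte[] signature, PublicKey publicKey)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA256withECDSA");
        verifier.initVerify(publicKey);
        verifier.update(payload);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey sessionKey = newSessionKey();
        KeyPair agentKeys = newAgentKeyPair();

        byte[] archive = "trace-archive-bytes".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = encrypt(archive, sessionKey);
        byte[] signature = sign(encrypted, agentKeys.getPrivate());
        System.out.println("valid: " + verify(encrypted, signature, agentKeys.getPublic()));

        encrypted[encrypted.length - 1] ^= 1; // simulate tampering in transit
        System.out.println("tampered: " + verify(encrypted, signature, agentKeys.getPublic()));
    }
}
```

Signing the ciphertext rather than the plaintext lets the receiver reject a forged or modified payload without holding the decryption key, which matches the split between File Acceptor (verifies) and Flink Load (decrypts) described in this document.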
Key Management
- The agent retrieves its public/private key pair from the BitDive backend on startup.
- Keys rotate every 24 hours. The agent caches the current key locally for high-throughput scenarios.
- Old keys are valid for 48 hours (grace period for in-flight data), then invalidated.
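The rotation and grace-period rules can be expressed as a small state check. This sketch assumes the 48-hour grace period starts when a key is rotated out (24 hours after issuance); the text does not say whether the window is measured from issuance or from rotation, so treat that choice, along with the class and enum names, as hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrative sketch only: the timing interpretation is an assumption.
final class KeyValiditySketch {
    enum KeyState { CURRENT, GRACE, INVALID }

    static final Duration ROTATION_INTERVAL = Duration.ofHours(24);
    static final Duration GRACE_PERIOD = Duration.ofHours(48);

    // Classifies a key by its age at the given moment.
    static KeyState stateOf(Instant issuedAt, Instant now) {
        Duration age = Duration.between(issuedAt, now);
        if (age.isNegative()) {
            return KeyState.INVALID; // issued in the future: clock skew or forgery
        }
        if (age.compareTo(ROTATION_INTERVAL) < 0) {
            return KeyState.CURRENT; // still the active key
        }
        if (age.compareTo(ROTATION_INTERVAL.plus(GRACE_PERIOD)) < 0) {
            return KeyState.GRACE;   // rotated out, still accepted for in-flight data
        }
        return KeyState.INVALID;
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2024-01-01T00:00:00Z");
        System.out.println(stateOf(issued, issued.plus(Duration.ofHours(1))));
        System.out.println(stateOf(issued, issued.plus(Duration.ofHours(30))));
        System.out.println(stateOf(issued, issued.plus(Duration.ofHours(80))));
    }
}
```

Under the stated assumption this prints `CURRENT`, `GRACE`, and `INVALID` for keys that are 1, 30, and 80 hours old.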
File Processing Pipeline
Process Diagram

Flow
- Retrieval: Flink Load pulls encrypted archives from MinIO.
- Decryption: Decryption keys are fetched from Vault on demand (never cached to disk).
- Parsing: The binary archive is parsed into structured trace data: methods, SQL, HTTP, Kafka.
- Storage: Parsed data is written to PostgreSQL over an SSL connection with AES-CBC encryption at the column level for sensitive fields.
- Cleanup: The encrypted archive in MinIO is retained per the retention policy, then automatically purged.
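The decryption step in this flow could look roughly like the sketch below. It assumes the archive is laid out as a 12-byte IV followed by the AES-GCM ciphertext and tag, and it generates a key locally for the demonstration; in the pipeline described above the key would come from Vault. Class and method names are hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Illustrative sketch only: archive layout and names are assumptions.
final class ArchiveDecryptSketch {
    static final int IV_BYTES = 12;
    static final int TAG_BITS = 128;

    // Assumed layout: [12-byte IV][ciphertext + 16-byte GCM tag].
    // Throws AEADBadTagException (a GeneralSecurityException) if the data was modified.
    static byte[] decrypt(byte[] archive, SecretKey key) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, archive, 0, IV_BYTES));
        return cipher.doFinal(archive, IV_BYTES, archive.length - IV_BYTES);
    }

    // Demo-only helper that produces an archive in the assumed layout.
    static byte[] encryptForDemo(byte[] plaintext, SecretKey key) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ciphertext = cipher.doFinal(plaintext);
        byte[] archive = new byte[IV_BYTES + ciphertext.length];
        System.arraycopy(iv, 0, archive, 0, IV_BYTES);
        System.arraycopy(ciphertext, 0, archive, IV_BYTES, ciphertext.length);
        return archive;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey key = generator.generateKey(); // stand-in for a key fetched from Vault

        byte[] archive = encryptForDemo("parsed later".getBytes(StandardCharsets.UTF_8), key);
        System.out.println(new String(decrypt(archive, key), StandardCharsets.UTF_8));
    }
}
```

Because GCM is authenticated encryption, a corrupted or tampered archive fails at `doFinal` instead of yielding garbage that the parser would then have to handle.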
Access Control and Authentication
Keycloak Integration
BitDive uses Keycloak for identity management:
- Protocols: OIDC and SAML 2.0.
- SSO: Connect your corporate identity provider (Okta, Azure AD, Google Workspace).
- MFA: Multi-factor authentication supported for all plans.
- RBAC: Role-based access control with predefined roles (Admin, Developer, Viewer) and custom roles on Enterprise plans.
API and MCP Authentication
- Dashboard API: Bearer tokens issued by Keycloak. Short-lived (15 minutes), automatically refreshed.
- MCP endpoints: API key authentication with per-key scope restrictions.
- Audit trail: Every API call is logged with user, timestamp, action, and affected resource (Enterprise plans).
Frontend Security
Process Diagram

- All dashboard communication is over TLS.
- Authentication tokens are stored in httpOnly cookies (not localStorage).
- CSP headers prevent XSS and injection attacks.
- For on-premise deployments, distribute the SSL certificate to browsers to avoid trust warnings.
Client Library Integration
The BitDive agent is a standard Maven/Gradle dependency:
<dependency>
  <groupId>io.bitdive</groupId>
  <artifactId>bitdive-producer-spring-3</artifactId>
  <version>0.0.15</version>
</dependency>
- No production code changes required. The agent uses the Java Instrumentation API (bytecode-level, not source-level).
- Performance overhead: 0.5–5% CPU depending on load and instrumentation depth.
- Kill switch: The agent can be disabled instantly via the dashboard without redeploying your application.
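For readers unfamiliar with the Java Instrumentation API, the sketch below shows its general shape: a transformer is registered with the JVM and is handed the bytecode of each class as it loads. This is a generic JDK illustration, not BitDive's agent. The `premain` entry point, the package prefix, and the counter are assumptions, and the transformer here only observes classes without rewriting them.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: a generic use of java.lang.instrument, not BitDive's agent.
final class AgentSketch {
    // Counts classes that fall inside the chosen package prefix.
    static final AtomicInteger MATCHED_CLASSES = new AtomicInteger();

    // The JVM passes class names in internal form, e.g. "com/yourcompany/orders/OrderService".
    static ClassFileTransformer transformerFor(String internalPrefix) {
        return new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (className != null && className.startsWith(internalPrefix)) {
                    MATCHED_CLASSES.incrementAndGet();
                }
                return null; // null tells the JVM to keep the original bytecode
            }
        };
    }

    // Called by the JVM when the class is loaded as a -javaagent (assumed entry point).
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(transformerFor("com/yourcompany/"));
    }

    public static void main(String[] args) throws Exception {
        ClassFileTransformer transformer = transformerFor("com/yourcompany/");
        transformer.transform(null, "com/yourcompany/orders/OrderService", null, null, new byte[0]);
        transformer.transform(null, "org/springframework/web/Dispatcher", null, null, new byte[0]);
        System.out.println("matched classes: " + MATCHED_CLASSES.get());
    }
}
```

A real tracing agent would return modified bytecode from `transform` to inject timing and capture hooks; returning `null` keeps this example side-effect free.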
Data Retention and Deletion
| Data Type | Default Retention | Enterprise |
|---|---|---|
| Detailed execution traces | 14 days (Starter) / 90 days (Pro) | Custom policy |
| Aggregated performance metrics | 1 year | Custom policy |
| Encrypted archives (MinIO) | Same as trace retention | Custom policy |
| Audit logs | 90 days (Pro) | SIEM export, custom retention |
- Deletion is automatic at the end of the retention window.
- Manual deletion is available via the dashboard (Data Subject erasure requests).
- On account deletion, all data is purged within 30 days.
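The automatic purge rule amounts to comparing a record's capture time plus its retention window against the current time. The sketch below uses the Starter and Pro trace retention values from the table; the class name, method name, and the choice to treat a record as due exactly when the window elapses are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Illustrative sketch only: names and boundary handling are assumptions.
final class RetentionSketch {
    // Detailed-trace retention per plan, taken from the table above.
    static final Map<String, Duration> TRACE_RETENTION = Map.of(
            "Starter", Duration.ofDays(14),
            "Pro", Duration.ofDays(90));

    // A record is due for purge once its retention window has fully elapsed.
    static boolean isDueForPurge(Instant capturedAt, Duration retention, Instant now) {
        return !now.isBefore(capturedAt.plus(retention));
    }

    public static void main(String[] args) {
        Instant captured = Instant.parse("2024-01-01T00:00:00Z");
        Instant now = Instant.parse("2024-01-20T00:00:00Z");
        System.out.println("Starter: " + isDueForPurge(captured, TRACE_RETENTION.get("Starter"), now));
        System.out.println("Pro: " + isDueForPurge(captured, TRACE_RETENTION.get("Pro"), now));
    }
}
```

For a trace captured on 1 January and checked on 20 January, the sketch reports the record as due for purge on Starter (`true`) and still retained on Pro (`false`).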
System Requirements
| Component | Requirement |
|---|---|
| Java Agent | JDK 8+ (supports 8, 11, 17, 21) |
| Docker (on-premise) | Docker 20.10+ and Docker Compose v2 |
| PostgreSQL | 14+ (provided in Docker Compose) |
| MinIO | Latest stable (provided in Docker Compose) |
| Network | Outbound HTTPS (port 443) to api.bitdive.io for SaaS. No inbound ports required. |
| Storage | 50 GB minimum for MinIO (scales with trace volume) |
Data Privacy Capabilities
The architecture provides the building blocks that organizations need to meet their compliance requirements:
| Capability | What It Gives You |
|---|---|
| PII masking at source | Sensitive fields are redacted inside the JVM before data leaves your infrastructure. The server never sees originals. |
| On-premise / air-gapped deployment | All data stays within your network. Zero external connections. |
| Encryption everywhere | TLS 1.3 in transit, AES-256 at rest, 24-hour key rotation via Vault. |
| RBAC and SSO | Keycloak-based access control with OIDC/SAML. Custom roles on Enterprise plans. |
| Data retention controls | Configurable retention windows with automatic purge. Manual deletion via dashboard. |
| GDPR data subject rights | Access, rectification, and erasure supported via the dashboard. DPA available on request. |
| Audit logging | Every API action logged with user, timestamp, and affected resource (Enterprise plans). |
For privacy questions or to request a Data Processing Agreement, contact privacy@bitdive.io.
Troubleshooting
Browser Trust Issues (On-Premise)
- Distribute the BitDive SSL certificate to all client machines.
- Verify certificate validity with `openssl s_client -connect your-bitdive:443`.
- Restart the browser after installing the certificate.
Agent Connection Issues
- Verify outbound HTTPS access to `api.bitdive.io` (SaaS) or your local BitDive instance.
- Check that the agent key has not expired (24-hour rotation).
- Review agent logs for `SSL handshake failed` or `signature verification failed` errors.
Data Not Appearing
- Confirm the agent is attached: look for `BitDive Agent initialized` in application startup logs.
- Check that your package filters include the classes you expect to instrument.
- Verify that the retention policy has not already purged older data.