Zero-trust architecture – Checklist

Zero-trust architecture for eCommerce requires: continuous identity verification (never implicit trust), micro-segmentation (isolate critical systems), least-privilege access (minimize user permissions), API security (token-based, not API keys), real-time monitoring (detect anomalies), and incident response workflows. Implementation spans 12 – 18 months for brownfield platforms and includes identity provider setup, API gateway hardening, and network isolation on AWS/GCP/Azure infrastructure.


Zero-Trust Architecture for eCommerce: A Practitioner’s Checklist

“Zero-trust” is the security buzzword that actually matters. Unlike most security terminology, zero-trust isn’t about compliance theater; it’s about architecture that survives real attacks.

The idea: Never trust anyone or anything, even inside your network. Verify every access request. Log everything. Assume breach.

For eCommerce, zero-trust means: A customer logging in, an employee accessing the database, a payment processor calling your API – all are verified in real-time, every time. No exceptions. No “trusted” internal networks where hackers can move laterally once they get in.

Bemeir has implemented zero-trust on 15+ eCommerce platforms, from Shopify Plus to custom Magento builds to serverless architectures on AWS. Here’s the real checklist – what your CTO needs to build, what your team needs to operate, and where to start.


Phase 1: Assessment & Strategy (Weeks 1 – 3)

Current State Inventory

  • [ ] Map all systems and services (list every application, database, API, third-party integration)
  • [ ] Identify data flows (where does customer data move? Where does it sit at rest?)
  • [ ] List all identities (employees, contractors, API keys, service accounts, automation users)
  • [ ] Document access patterns (who logs in to what? How often? From where?)
  • [ ] Identify trust boundaries currently in place (what systems are “trusted”? Which aren’t?)
  • [ ] List external integrations (payment processors, CDN, email, marketing, analytics)
  • [ ] Audit VPN usage (who uses it? Why? For how long?)
  • [ ] Document current logging & monitoring (what events are logged? Who reviews logs?)
  • [ ] Identify compliance requirements (GDPR, SOC 2, PCI-DSS, industry-specific?)

Why This Matters: You can’t build zero-trust without understanding your current architecture. Bemeir once started a zero-trust project on a platform that had “forgotten” about a legacy SSH jump box that anyone with the password could access. One hole like that defeats zero-trust.

Critical Asset Identification

  • [ ] Define “crown jewels” (customer payment data, customer PII, your source code, admin dashboards)
  • [ ] Prioritize systems by risk (payment processing = high risk; blog = low risk)
  • [ ] Map which assets interact (does payment service need access to customer database? Really?)
  • [ ] Identify attack surface (which systems are exposed to internet? Which are internal-only?)
  • [ ] Document data sensitivity (public data vs. confidential vs. secret)
  • [ ] List compliance-sensitive assets (systems that need audit logs, encryption, etc.)

Organizational & Process Audit

  • [ ] Document current onboarding process (when new employee joins, how do they get access?)
  • [ ] List offboarding procedures (when someone leaves, do all their credentials actually get revoked?)
  • [ ] Check for shared accounts (do multiple people share one database password? SSH key? API key?)
  • [ ] Document change management (who can deploy code? What’s the approval process?)
  • [ ] Identify incident response procedures (if hacked at 2am, who does what?)
  • [ ] List compliance/audit obligations (annual security audit? Monthly? Real-time monitoring?)
  • [ ] Assess team capability (do you have someone who understands mTLS? OAuth2? Network segmentation?)

Honest Reality: Most eCommerce companies score poorly here. Shared database passwords, no change log, vague incident response. This is normal; zero-trust fixes it.

Business Impact Planning

  • [ ] Identify “can’t fail” systems (payment processing, order fulfillment, customer login)
  • [ ] Estimate downtime tolerance (can payment API be down for 1 minute? 1 hour?)
  • [ ] Calculate cost of breach (estimated loss of customer trust + liability + remediation)
  • [ ] Plan for false positives (zero-trust monitoring creates alerts; who investigates them?)
  • [ ] Estimate user impact (will employees notice higher friction during auth? How much is acceptable?)
  • [ ] Define success metrics (faster incident detection? Fewer breaches? Compliance pass?)

Phase 2: Identity & Access Architecture (Weeks 4 – 8)

Identity Provider Setup

  • [ ] Choose identity provider (Okta, Azure AD / Microsoft Entra ID, Auth0, custom OAuth2 provider on AWS?)
  • [ ] Implement multi-factor authentication (MFA for all human users; OAuth2 client credentials or mTLS for service accounts, which cannot complete an MFA prompt)
  • [ ] Set up identity federation (can contractors/partners authenticate without your directory?)
  • [ ] Plan for passwordless authentication (FIDO2, phone sign-in, biometric)
  • [ ] Configure session management (how long before session expires? Can users have multiple sessions?)
  • [ ] Implement device posture checking (can users only log in from company devices? Or any device?)
  • [ ] Set up real-time revocation (if someone is fired, how fast can you revoke their session?)
  • [ ] Plan for audit logging (every authentication event logged? Include success and failures?)

Bemeir Recommendation: Okta or Auth0 for most mid-market eCommerce. They’re managed, handle MFA well, and integrate with everything. If you’re AWS-only, consider Cognito + IAM.

Access Control Architecture

  • [ ] Implement role-based access control (RBAC) for humans (employees, contractors, vendors)
  • [ ] Define roles (Developer, DevOps, Product Manager, Customer Support, Finance, etc.)
  • [ ] Map permissions per role (Developer can deploy to staging but not production)
  • [ ] Implement attribute-based access control (ABAC) for fine-grained rules (access allowed only if: user is in “payments” team AND request is from office network AND time is 9am-6pm)
  • [ ] Set up just-in-time (JIT) access (temporary elevated permissions for emergency fixes, auto-revoke after 1 hour)
  • [ ] Plan for access reviews (quarterly, manually verify: does this person still need this role?)
  • [ ] Document approval workflow (how do people request access? Who approves? What’s the SLA?)
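
The ABAC rule in the list above (payments team AND office network AND 9am-6pm) can be sketched as a small policy function. This is a minimal illustration, not a production policy engine: the `AccessRequest` type, the `is_allowed` function, and the `OFFICE_NETWORK` range (a reserved documentation address block) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, time
from ipaddress import ip_address, ip_network

# Assumption: 203.0.113.0/24 is a documentation-only range standing in for the office network.
OFFICE_NETWORK = ip_network("203.0.113.0/24")
BUSINESS_HOURS = (time(9, 0), time(18, 0))  # 9am-6pm, as in the checklist example


@dataclass(frozen=True)
class AccessRequest:
    user_team: str
    source_ip: str
    timestamp: datetime  # assumed to already be in the office's local time zone


def is_allowed(request: AccessRequest) -> bool:
    """Allow only if every attribute check passes; anything else is denied."""
    in_team = request.user_team == "payments"
    in_office = ip_address(request.source_ip) in OFFICE_NETWORK
    start, end = BUSINESS_HOURS
    in_hours = start <= request.timestamp.time() < end
    return in_team and in_office and in_hours
```

Because the function returns True only when all three checks pass, a request with a missing or unexpected attribute is denied by default, which is the behavior zero-trust asks for.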

Common Mistake: Setting up RBAC but never reviewing it. Bemeir clients typically find that 30% of access permissions are outdated (person was promoted, role changed, but permissions never updated).

Service-to-Service Authentication

  • [ ] Stop using hardcoded API keys (replace with OAuth2 client credentials flow or mTLS)
  • [ ] Implement OAuth2 / OIDC for all service-to-service communication
  • [ ] Set up mutual TLS (mTLS) for critical service boundaries (payment → order service, for example)
  • [ ] Plan for token rotation (tokens expire after 1 hour; automatically refresh)
  • [ ] Implement token binding (token is bound to a specific service identity; can’t be reused elsewhere)
  • [ ] Monitor token usage (if a service starts requesting tokens at 10× normal rate, alert)
  • [ ] Plan for service discovery (if services are containerized, how do they find each other securely?)

Architecture Pattern: This is where zero-trust gets real. Old way: “Service A runs on 10.0.1.x IP range; Service B trusts all requests from that range.” Zero-trust way: “Service A requests a token from identity provider; presents token to Service B; Service B validates token every time.”
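
The token flow in this pattern can be sketched with the Python standard library alone. This is a simplified illustration under stated assumptions: it signs tokens with a shared HMAC secret (`SIGNING_KEY`) so the example is self-contained, whereas a real identity provider would normally sign with an asymmetric key and services would use a maintained JWT library. The function names `issue_token` and `validate_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Assumption: a shared demo secret; real deployments typically verify an
# asymmetric signature published by the identity provider instead.
SIGNING_KEY = b"demo-only-secret"


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(service: str, audience: str, ttl_seconds: int = 3600,
                now: Optional[float] = None) -> str:
    """Identity-provider side: mint a signed token for `service`, valid only at `audience`."""
    issued_at = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": service, "aud": audience, "exp": issued_at + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    signature = hmac.new(SIGNING_KEY, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def validate_token(token: str, expected_audience: str,
                   now: Optional[float] = None) -> Optional[dict]:
    """Service B side: run on every request; return the claims, or None if rejected."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(SIGNING_KEY, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None  # signature mismatch: forged or tampered token
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return None  # malformed token
    current = time.time() if now is None else now
    if claims.get("aud") != expected_audience or claims.get("exp", 0) <= current:
        return None  # issued for a different service, or expired
    return claims
```

The audience check is what keeps a token issued for Service B from being replayed against another service, and the one-hour default lifetime mirrors the expiry suggested in the checklist.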

External Integration Security

  • [ ] Audit all third-party integrations (payment processors, shipping APIs, CRM, analytics)
  • [ ] Check: which integrations have persistent API keys? (Replace with OAuth2 + scoped access)
  • [ ] Plan for least-privilege API scopes (payment processor only needs to read invoices, not modify customers)
  • [ ] Set up API key rotation (rotate keys at least every 90 days)
  • [ ] Implement webhook validation (verify webhooks come from actual third-party, not attacker)
  • [ ] Monitor third-party API usage (unusual spike in API calls? Alert)
  • [ ] Plan for vendor security (have you asked: does your payment processor have SOC 2? Do they scan for vulnerabilities?)

| Integration | Current Model | Zero-Trust Model |
|---|---|---|
| Payment Processor | Persistent API key in code | OAuth2 + token (1-hour expiry) |
| Shipping API | API key in environment variable | OAuth2 + scoped permission (“read shipment”, not “delete shipment”) |
| Email Service | API key in config file | OAuth2 + webhook validation |
| Analytics | Beacon pixel + API key | OIDC + scoped access |
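
Webhook validation, mentioned in the checklist above, is often done by checking a signature that the sender computes over the request body. The sketch below assumes one common scheme, a hex-encoded HMAC-SHA256 of the raw body with a shared secret. The exact header name, encoding, and any timestamp handling vary by vendor, so treat `sign_payload` and `is_valid_webhook` as hypothetical helpers and follow each provider's documentation.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """What the sender is assumed to compute: hex HMAC-SHA256 of the raw body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_webhook(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Recompute the signature over the raw bytes and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check does not leak timing information, and it runs on the raw bytes before any JSON parsing, because re-serialized JSON may not match what the sender signed.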

Phase 3: Network & Infrastructure Hardening (Weeks 9 – 13)

Micro-Segmentation

  • [ ] Map network boundaries (separate payment service from blog? Yes)
  • [ ] Implement network policies (security groups, NACLs, Calico policies if Kubernetes)
  • [ ] Segment by sensitivity (crown jewel services are most restricted)
  • [ ] Plan for east-west traffic rules (traffic between internal services; most should be blocked by default)
  • [ ] Implement API gateway (single entry point for all external requests; validate every request)
  • [ ] Set up web application firewall (WAF) rules (block SQL injection, XSS, DDoS patterns)
  • [ ] Plan for service-to-service mTLS (payment service can only call order service if authenticated)
  • [ ] Monitor unusual traffic patterns (spike in traffic between normally isolated systems? Alert)

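Practical Example:

Old way (Implicit Trust): every service sits on one flat internal network, so anything that gets inside (a compromised blog server, for example) can reach the payment service directly.

Zero-trust way: east-west traffic is blocked by default; the payment service can call the order service only after authenticating over mTLS, and the blog cannot reach either one.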
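
The contrast between the two models can be sketched in a few lines. This is a minimal illustration rather than real security-group or Kubernetes policy syntax; the service names and the `ALLOWED_FLOWS` set are hypothetical.

```python
# Hypothetical service names; each pair is (caller, callee) and must be listed explicitly.
ALLOWED_FLOWS = {
    ("api-gateway", "order-service"),
    ("payment-service", "order-service"),
}


def implicit_trust_allows(caller_is_internal: bool) -> bool:
    """Old way: anything already inside the network is trusted."""
    return caller_is_internal


def zero_trust_allows(caller: str, callee: str, authenticated: bool) -> bool:
    """Zero-trust way: default deny; allow only listed flows from authenticated callers."""
    return authenticated and (caller, callee) in ALLOWED_FLOWS
```

Under the first function a compromised blog server passes simply because it is internal. Under the second, `zero_trust_allows("blog", "payment-service", True)` is False because that flow was never listed.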

Infrastructure Access Control

  • [ ] Implement bastion hosts / jump boxes with MFA (only way to SSH to servers)
  • [ ] Disable direct SSH to production servers (all access goes through bastion, is logged)
  • [ ] Use IAM roles instead of long-lived credentials (an AWS EC2 instance gets temporary role credentials; no access keys stored on the server)
  • [ ] Enable AWS Systems Manager Session Manager (access instances without SSH keys; all sessions logged)
  • [ ] Remove standing VPN access (VPN is implicit trust; replace with specific, temporary access per task)
  • [ ] Plan for database access (no direct database access from laptops; go through application only)
  • [ ] Implement database activity monitoring (log all queries; alert on unusual access patterns)

Why This Matters: A developer’s laptop gets stolen. Under the old model, the attacker has SSH keys, database passwords, API keys – everything. Under the zero-trust model, the attacker has nothing useful; every action requires real-time authentication from the identity provider.

Container & Infrastructure Hardening

  • [ ] Implement image scanning (scan container images for vulnerabilities before deployment)
  • [ ] Sign container images (only run images built and signed by trusted pipeline)
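  • [ ] Enforce Pod Security Standards (Kubernetes: restrict what containers can do; PodSecurityPolicy was removed in Kubernetes 1.25, so use Pod Security Admission or a policy engine)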
  • [ ] Implement network policies in Kubernetes (pod-to-pod traffic rules)
  • [ ] Monitor container runtime behavior (detect anomalies: unexpected process, network conn, file writes)
  • [ ] Use least-privilege container security context (container runs as non-root, read-only filesystem)
  • [ ] Implement admission controllers (Kubernetes: block deployments that don’t meet security standards)
  • [ ] Plan for supply chain security (dependencies, open-source libraries – are they from trusted sources?)

Phase 4: Detection, Monitoring & Response (Weeks 14 – 18)

Logging & Data Collection

  • [ ] Enable audit logging for all systems (identity provider, databases, APIs, infrastructure)
  • [ ] Aggregate logs to central location (ELK, Splunk, CloudWatch, DataDog)
  • [ ] Log all authentication events (success and failure; include context: IP, device, location, time)
  • [ ] Log all data access (who read customer records? When? From where?)
  • [ ] Log all privileged actions (who deployed code? Who changed firewall rules? Etc.)
  • [ ] Log API calls (which service called which? What data was accessed?)
  • [ ] Enable AWS CloudTrail (if AWS-based) for all API calls
  • [ ] Plan for log retention (how long to keep logs? Compliance requirement? 90 days? 1 year?)
  • [ ] Ensure logs are immutable (once written, can’t be modified or deleted; prevents attackers from covering tracks)
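
Immutability is usually enforced by the storage layer (write-once buckets, restricted delete permissions). A complementary technique, shown here only as a sketch and not something the checklist prescribes, is a hash chain that makes tampering detectable: each entry's hash covers the previous entry's hash. The `append_entry` and `verify_chain` names and the entry layout are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def append_entry(chain: list, event: dict) -> None:
    """Append an event whose hash covers both the event and the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
    chain.append({"event": event, "prev": prev_hash, "hash": entry_hash})


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited, reordered, or mid-chain deleted entry fails the check."""
    prev_hash = GENESIS
    for entry in chain:
        body = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

One limitation to keep in mind: dropping entries from the end of the chain is not detected unless the latest hash is also stored somewhere the attacker cannot reach.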

Volume Reality: A mid-market eCommerce platform generates 10 – 100GB of logs per day. You need:
– Storage: S3, Glacier for long-term
– Analytics: Splunk or DataDog to query and analyze
– Retention policy: Keep 90 days hot, 1 year archived, destroy after 7 years (or comply with legal hold)
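
To turn those figures into a rough storage estimate, the arithmetic can be sketched as below. It assumes the hot tier holds 90 days and the archive tier a full 365 days of raw, uncompressed logs; compression and sampling would lower the totals. The function name is hypothetical.

```python
def retention_volume_gb(daily_gb: float, hot_days: int = 90, archive_days: int = 365) -> dict:
    """Raw (uncompressed) log volume held in each tier once retention reaches steady state."""
    return {"hot_gb": daily_gb * hot_days, "archive_gb": daily_gb * archive_days}


for daily in (10, 100):  # the low and high ends of the range quoted above
    tiers = retention_volume_gb(daily)
    print(f"{daily} GB/day -> hot: {tiers['hot_gb']:,} GB, archive: {tiers['archive_gb']:,} GB")
```

At 10 GB per day that is 900 GB hot and 3,650 GB archived; at 100 GB per day it is 9,000 GB hot and 36,500 GB archived.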

Anomaly Detection & Alerting

  • [ ] Set up SIEM (Security Information & Event Management) rules:
      • [ ] Alert: Failed login attempts (10+ in 5 minutes = potential attack)
      • [ ] Alert: Unusual geographic location (employee in NYC yesterday, Tokyo today = suspicious)
      • [ ] Alert: Privilege escalation (regular user suddenly requests admin role = review)
      • [ ] Alert: Data exfiltration pattern (service requesting 10,000× normal amount of data = investigate)
      • [ ] Alert: Off-hours access (critical system accessed at 3am = verify)
  • [ ] Implement behavior analytics (what’s normal? What’s anomalous? ML helps detect novel attacks)
  • [ ] Set up alerting thresholds (don’t alert on every anomaly; prioritize high-risk ones)
  • [ ] Plan for alert fatigue (too many alerts = people ignore them; tune carefully)
  • [ ] Document response procedures (alert fires; what do you do next?)
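
The failed-login rule above (10+ failures in 5 minutes) is a sliding-window count. A real SIEM would express it in its own rule language; the sketch below only illustrates the logic. The `FailedLoginDetector` class is hypothetical and assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60  # 5 minutes, from the checklist rule
THRESHOLD = 10           # 10 or more failures inside the window raises an alert


class FailedLoginDetector:
    def __init__(self) -> None:
        self._failures = defaultdict(deque)  # account -> timestamps of recent failures

    def record_failure(self, account: str, timestamp: float) -> bool:
        """Record one failed login; return True if the account is at or over the threshold."""
        window = self._failures[account]
        window.append(timestamp)
        # Drop failures that have aged out of the 5-minute window.
        while window and window[0] <= timestamp - WINDOW_SECONDS:
            window.popleft()
        return len(window) >= THRESHOLD
```

Ten failures ten seconds apart trip the rule on the tenth attempt, while ten failures spread over nine minutes never do, which is the kind of distinction that keeps this alert from firing on ordinary typos.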

Tuning: Bemeir typically runs new SIEM rules in “dry run” mode for 2 weeks first. Expect a flood of false positives initially; once the rules are tuned, the alerts that remain are actionable.

Incident Response Procedures

  • [ ] Document incident response plan (who to call? What’s the first step?)
  • [ ] Define severity levels (Severity 1 = customer data may be exposed; Severity 2 = system down but no data risk)
  • [ ] Create incident response runbooks (if X event occurs, steps 1 – 10 are: …)
  • [ ] Plan for containment (how do you stop an active attack? Revoke tokens? Isolate systems?)
  • [ ] Define forensics process (if breached, who collects evidence? Chain of custody?)
  • [ ] Plan for communication (who tells customers? Legal? Insurance? Regulators?)
  • [ ] Set up war room procedures (incident happens → team joins Slack channel → someone leads; clear roles)
  • [ ] Document post-incident review (after incident, analyze: what happened? What failed? What do we fix?)

Critical: Have this plan before an incident. Bemeir recommends tabletop exercises quarterly (“If we were breached tomorrow, who would…?”).

Continuous Security Testing

  • [ ] Implement vulnerability scanning (automated: scan systems weekly for known vulnerabilities)
  • [ ] Perform penetration testing (hired hackers, annual or bi-annual, attempt to break in; report findings)
  • [ ] Set up DAST (Dynamic Application Security Testing) (automated: test running web app for common vulns: SQLi, XSS, CSRF)
  • [ ] Implement SAST (Static Application Security Testing) (code analysis tool runs on every code commit; flags security issues)
  • [ ] Plan for supply chain security scans (check dependencies for known vulns: npm audit, Snyk, etc.)
  • [ ] Document remediation timelines (critical vuln found; how fast can you patch? 24 hours? 1 week?)

| Testing Type | Frequency | Purpose | Cost |
|---|---|---|---|
| Vulnerability Scanning | Weekly automated | Find known vulns | $5 – 20K/year (tool) |
| SAST (code analysis) | Per commit | Catch security bugs in code | $20 – 50K/year |
| DAST (runtime testing) | Daily automated | Test running app | $15 – 40K/year |
| Penetration Testing | Annual/Bi-annual | Real-world attack sim | $30 – 80K per engagement |

Phase 5: Governance & Continuous Improvement (Ongoing)

Access Reviews & Recertification

  • [ ] Quarterly access review (all employees: do they still need their current permissions?)
  • [ ] Manager certification (manager must certify: “Yes, this person still needs this access”)
  • [ ] Contractor audit (contractors should have temporary access that is revoked when the project ends)
  • [ ] Service account review (API keys, service credentials – are they still needed? Rotate quarterly)
  • [ ] Privileged access review (who has admin? Do they really need it? Keep it within the <5% of staff target under Metrics & KPIs)

Policy & Documentation

  • [ ] Document zero-trust architecture (diagram, decision log, rationale)
  • [ ] Create security policy (what’s acceptable? What’s prohibited?)
  • [ ] Build change management policy (how are changes approved? Who can deploy?)
  • [ ] Establish incident response policy (who’s involved? What’s the communication plan?)
  • [ ] Document compliance requirements (GDPR, SOC 2, PCI – what specifically do we need to do?)
  • [ ] Train team on policies (everyone understands expectations)

Metrics & KPIs

  • [ ] Track mean time to detect (MTTD) breach (goal: <1 hour for critical incidents)
  • [ ] Track mean time to respond (MTTR) (goal: <4 hours)
  • [ ] Monitor authentication latency (goal: <100ms even during peak load)
  • [ ] Measure false positive rate (goal: <10% of alerts are false)
  • [ ] Track access request SLA (goal: access granted within 24 hours)
  • [ ] Monitor privileged access usage (goal: <5% of staff have admin; document why)
  • [ ] Calculate security improvement (fewer incidents? Faster detection? Report to board)
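
MTTD and MTTR are averages over incident timestamps, so they can be computed from whatever incident log you keep. The sketch below assumes a hypothetical record with `occurred`, `detected`, and `contained` datetimes, and measures MTTR from detection to containment. Definitions vary between teams, so state yours explicitly when reporting.

```python
from datetime import datetime
from statistics import mean


def mean_hours(intervals: list) -> float:
    """Average a list of timedeltas, expressed in hours."""
    return mean(delta.total_seconds() for delta in intervals) / 3600


def mttd_mttr(incidents: list) -> tuple:
    """Return (MTTD, MTTR) in hours for incidents with 'occurred', 'detected', 'contained'."""
    mttd = mean_hours([i["detected"] - i["occurred"] for i in incidents])
    mttr = mean_hours([i["contained"] - i["detected"] for i in incidents])
    return mttd, mttr
```

The results can be compared directly against the goals above (under 1 hour to detect, under 4 hours to respond).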

Common Implementation Blockers & Solutions

| Blocker | Root Cause | Solution |
|---|---|---|
| MFA breaks integrations | Third-party tools don’t support MFA | Use service accounts with OAuth2 instead of human user MFA |
| Latency increases | Token validation adds overhead | Cache the identity provider’s signing keys and validate tokens locally; keep validating every request rather than trusting a session |
| Team resists change | “This is too much friction” | Gradual rollout; start with non-critical systems; measure friction |
| Legacy system can’t do OAuth2 | Old application doesn’t support modern auth | Build OAuth2 adapter/shim; eventually sunset legacy system |
| Too many false positives | SIEM rules too strict | Tune rules; run in dry-run mode first; educate on what’s “normal” |
| Cost too high | Identity provider, SIEM, security tools expensive | Start with open-source (Keycloak, ELK); graduate to managed as you scale |
