
Most enterprise eCommerce agency selections happen on the strength of a sales deck and a polished case study. The technology leaders who later regret those selections all describe the same pattern: the deck looked great, the references were warm, the contract got signed, and then the actual delivery team turned out to know less about the platform than the architects who pitched the work. The disconnect between what an agency sells and what an agency builds is the single largest source of platform expertise mismatches in commerce engagements.
This guide is for the CIO, CTO, or senior IT buyer who has to make this decision and does not want to be wrong about it. It walks through how to separate real platform expertise from sales theater, what technical due diligence actually catches, the certifications that matter and the ones that do not, and the reference questions that produce honest answers.
The Difference Between Selling and Building
The pattern to watch for is whether the people you are talking to during the sales process are the same people you will be working with during delivery. At smaller agencies these are often the same humans. At larger agencies they almost always are not, and the depth gap between the two groups can be enormous.
The single most useful sales-process intervention is to insist on meeting the proposed delivery team before signing. This means not the named architects on the proposal, but the actual engineers and project lead who will be doing the work for the first three months. Ask each of them to walk you through a specific technical decision they made on a recent project, and to explain the tradeoffs they considered. The depth of that conversation is the depth you will get during delivery. If the agency resists this conversation, that resistance is your answer.
Technical Due Diligence That Actually Catches Things
The technical due diligence questions that surface real platform expertise are the ones that require specific, platform-native answers. Generic answers indicate generic skill. The questions below are organized by the platforms most relevant to enterprise and mid-market commerce.
For Magento and Adobe Commerce, ask how the agency handles plugin versus preference decisions in custom modules. Ask how they would architect a custom checkout step that needs to call an external service for fraud screening. Ask how they manage Composer dependencies across their client base, and how they handle Adobe security patches. Ask whether they run their builds with PHPStan or a similar static analysis tool, at what level, and how they handle violations. The answers should be specific, the constraints they describe should be familiar from the Adobe Commerce DevDocs, and the engineer should have opinions formed from real experience.
For Hyva, ask how they handle component customization without breaking upgrades, how they integrate with existing Magento extensions that ship Luma frontend code, and what their migration approach looks like for a Magento 2 site with five years of legacy theme work. This is the kind of question Bemeir’s Hyva practice gets asked during enterprise selections, and the right answer involves named engineers, dated migration case studies, and a clear position on which legacy module patterns survive the rebuild. Hyva expertise is genuinely hard to fake because the platform is recent enough that most agency expertise is shallow. The Hyva Solution Partner program, listed on the Hyva website, is a meaningful filter.
For Shopify Plus, ask how they handle the move from checkout.liquid to checkout extensibility, how they architect headless storefronts using the Storefront API, and what their approach is to managing app sprawl on existing merchants. Ask about specific Shopify Functions they have built. Most agencies that say they do Shopify Plus mean they do Shopify themes; the deeper expertise around B2B configurations, Shopify Flow automation, and Functions development is rarer.
For Shopware, ask about their familiarity with the Shopware 6 architecture, the Symfony foundation underneath it, and the Shopware Solution Partner certification. Shopware is growing rapidly in the US mid-market, and agencies that legitimately do it well are still relatively rare.
Code Review as a Selection Tool
The most telling technical due diligence step is asking the agency to walk you through actual code from a recent project, with screen share, in a working session. The point is to see the code itself, not slides about code. Most agencies will agree to this with appropriate confidentiality protections, and the ones that refuse are filtering themselves out for you.
What to look for during this session: clean module structure, obvious adherence to the platform’s coding standards, sensible separation of concerns, comments and naming that suggest the original developer was thinking about future maintainers. The engineer walking you through it should be able to explain why each architectural decision was made and what they would change if they did it over.
For the agencies that will not show you live code, an alternative is asking for a small paid technical assessment: pay them for a week of work to produce a specific deliverable (an architecture review of your current platform, a migration assessment, a performance audit) and use that deliverable as your evaluation artifact. This is what Bemeir typically suggests to enterprise clients during the agency selection process: a paid discovery engagement that produces real value either way and gives both sides clear visibility into how the partnership will actually work.
Certifications: Which Ones Matter
Certifications are imperfect but useful as a screening filter. The ones that genuinely correlate with depth of expertise are the platform-vendor certifications that require demonstrated engineering work, not classroom-style certifications.
| Certification | Relevance to CTO Evaluation | What Good Looks Like |
|---|---|---|
| Adobe Certified Expert (Magento Backend Developer) | High for Magento delivery | Multiple certified engineers on staff, recent versions |
| Adobe Solution Partner tier (Bronze, Silver, Gold, Platinum) | High, but check the tier carefully | Gold or Platinum with current revenue tier maintained |
| Hyva Solution Partner | High for Hyva work | Listed on hyva.io with case studies |
| Shopify Partner (Plus, Premier, Expert) | Medium-High | Shopify Plus Partner with named contact at Shopify |
| Shopify Headless Theme certifications | Medium | Demonstrates current skill set, less critical than Plus tier |
| Shopware Solution Partner (Bronze, Silver, Gold) | High for Shopware work | Active Solution Partner with Shopware 6 work |
| BigCommerce Elite Partner | Medium | Indicates platform commitment, less common in mid-market |
| AWS certifications (Solutions Architect, DevOps) | High for self-hosted Magento on AWS | Multiple certified engineers across SA Pro and DevOps Pro |
| ISO 27001 / SOC 2 | High for enterprise procurement | Real audit, not just claims |
| Generic project management certifications | Low | Useful but not a depth indicator |
The pattern is consistent. Vendor-specific certifications at the higher tiers (Gold, Platinum, Premier, Elite) require ongoing revenue commitments and audits, which makes them hard for an agency to fake. Generic technology certifications are useful but do not differentiate platform expertise from generic engineering skill.
The other dimension that matters is recency. A Magento Certified Developer who passed in 2017 on Magento 2.1 has fundamentally different knowledge than one who passed in 2024 on Adobe Commerce 2.4.7. Ask for the certification dates.
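One way to act on the recency point is to collect the certification dates per engineer and flag the stale ones. The sketch below is only an illustration: the record fields and the three-year cutoff are assumptions made for the example, not a rule from any vendor's certification program.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Certification:
    # Hypothetical record shape, for illustration only.
    engineer: str
    name: str
    passed_on: date


def years_between(earlier: date, later: date) -> int:
    """Whole years elapsed from `earlier` to `later`."""
    years = later.year - earlier.year
    # Subtract one if the anniversary has not been reached yet this year.
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def stale_certifications(certs, as_of: date, max_age_years: int = 3):
    """Return certifications at least `max_age_years` old as of `as_of`.

    The default cutoff of three years is an assumed threshold.
    """
    return [
        c for c in certs
        if years_between(c.passed_on, as_of) >= max_age_years
    ]


if __name__ == "__main__":
    team = [
        Certification("Engineer A", "Magento 2.1 developer exam", date(2017, 6, 1)),
        Certification("Engineer B", "Adobe Commerce 2.4.7 developer exam", date(2024, 3, 15)),
    ]
    for cert in stale_certifications(team, as_of=date(2025, 1, 1)):
        print(f"{cert.engineer}: {cert.name} (passed {cert.passed_on})")
```

A stale date does not disqualify an engineer by itself; it is a prompt to ask what they have shipped on current versions since.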
Reference Customer Questions That Produce Honest Answers
Reference calls are mostly theater because the agency has chosen who you talk to. The questions that occasionally cut through this filter are the ones that ask about specific, comparable situations rather than general satisfaction.
Useful questions for reference calls include: what did you most underestimate during the engagement, what would you do differently if you started over, what surprised you about how the agency worked, and what is the one thing they could improve. The “improve” question is especially useful because every reference will have a real answer if you give them permission to share it. Watch for the answer that comes after the pause, not the rehearsed positive.
Other questions that surface real signal: how did the agency handle the moment when something went wrong, how were post-launch issues prioritized, did the people who pitched the work stay engaged through delivery, and would you hire them again for a different platform. The last question is interesting because it asks the reference to project the agency’s general competence, which often surfaces nuance that a question about the specific completed engagement would not.
Red Flags to Filter On
A handful of red flags reliably correlate with delivery problems. None are deal-breakers in isolation, but the pattern matters.
Watch for an agency that cannot describe its own technical hiring process in detail; that staffs every project with a fresh team rather than maintaining domain consistency; whose named technical lead in the proposal is not currently assigned to any client engagement; that does not run automated tests on the code it ships to production; whose case studies all describe outcomes without describing the engineering decisions that produced them; or that hesitates when asked about its approach to the Adobe security patch cadence or the Shopify checkout extensibility migration.
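Because the pattern matters more than any single flag, it can help to tally the flags per agency rather than react to each one in isolation. The sketch below is one possible way to do that; the flag keys and the threshold of two are assumptions for illustration, and the right threshold is a judgment call for the buyer.

```python
# Red flags from the list above, keyed by hypothetical short identifiers.
RED_FLAGS = {
    "vague_hiring_process": "Cannot describe its technical hiring process in detail",
    "fresh_team_every_project": "Staffs every project with a fresh team",
    "lead_not_on_engagements": "Named technical lead is not on any current client engagement",
    "no_automated_testing": "Does not run automated testing on production deliveries",
    "outcome_only_case_studies": "Case studies omit the engineering decisions",
    "hesitates_on_patches_or_checkout": "Hesitates on Adobe patch cadence or Shopify checkout migration",
}


def summarize_red_flags(observed, pattern_threshold=2):
    """Summarize which red flags were observed for one agency.

    `pattern_threshold` is an assumed cutoff for calling it a pattern.
    """
    unknown = set(observed) - set(RED_FLAGS)
    if unknown:
        raise ValueError(f"Unknown red flag keys: {sorted(unknown)}")
    # Preserve the order of RED_FLAGS so reports are stable.
    hits = [key for key in RED_FLAGS if key in observed]
    return {
        "count": len(hits),
        "flags": [RED_FLAGS[key] for key in hits],
        "is_pattern": len(hits) >= pattern_threshold,
    }


if __name__ == "__main__":
    summary = summarize_red_flags({"no_automated_testing", "fresh_team_every_project"})
    print(summary["count"], summary["is_pattern"])
    for description in summary["flags"]:
        print("-", description)
```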
According to Gartner research on technology services procurement, the strongest predictors of successful agency engagements are clear technical due diligence during selection, named delivery team continuity from sales through delivery, and explicit governance models for in-flight scope changes. Agencies that resist any of these are signaling something useful about how the engagement will run.
The Niche Depth Test
The strongest single test of platform expertise is asking about something niche. For Magento, ask about the new GraphQL persistent queries support, or about how they would tune Elasticsearch (or OpenSearch on newer releases) for a 200,000-SKU catalog with heavy attribute-based filtering. For Hyva, ask how they handle interaction between Hyva components and a third-party module that ships its own Luma templates. For Shopify Plus, ask about the differences between checkout extensions and the legacy Script Editor, and which workflows belong in Shopify Flow versus a dedicated middleware layer.
The agency with real depth will engage these questions as interesting problems and have specific answers. The agency without it will pivot to general principles or admit they would need to research the specifics. Both responses are useful information, but they tell you very different things about who you are about to hire.
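As an illustration of what a specific answer to the Elasticsearch question might touch on, the sketch below builds a search request body in Elasticsearch's query DSL, putting attribute filters in the `bool` query's `filter` clause (which does not affect relevance scoring and can be cached) and requesting `terms` aggregations for facet counts. The field names (`name`, `color`, `size`) and the function itself are hypothetical and do not reflect the index mapping Magento actually generates.

```python
def build_filtered_search(text, attribute_filters, facet_fields, page_size=24):
    """Build an Elasticsearch query DSL body as a plain dict.

    text: free-text search string matched against a hypothetical `name` field.
    attribute_filters: mapping of field name to a list of accepted values.
    facet_fields: fields to return value counts for.
    """
    # One `terms` clause per filtered attribute; all must match.
    filters = [
        {"terms": {field: values}}
        for field, values in attribute_filters.items()
    ]
    return {
        "size": page_size,
        "query": {
            "bool": {
                # Scored full-text match.
                "must": [{"match": {"name": text}}],
                # Non-scoring attribute filters.
                "filter": filters,
            }
        },
        # Facet counts, computed over the filtered result set.
        "aggs": {
            field: {"terms": {"field": field, "size": 50}}
            for field in facet_fields
        },
    }


if __name__ == "__main__":
    import json

    body = build_filtered_search(
        "jacket",
        {"color": ["red", "blue"]},
        ["color", "size"],
    )
    print(json.dumps(body, indent=2))
```

An engineer with real depth would likely go further than this, for example into mapping choices for filterable attributes or how facet counts should behave for the attribute currently being filtered.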
What Selection Is Really For
The final way to think about this evaluation is that you are not hiring an agency to deliver a project. You are hiring an agency to make hundreds of small architectural decisions over the course of a multi-year relationship, each of which compounds into the long-term health of your commerce stack. Most of those decisions will happen without you in the room. The selection process is your one good chance to make sure the people making those decisions know what they are doing.
Bemeir’s pitch to enterprise CTOs is straightforward on this point: insist on the proper due diligence, ask the deeper





