AI-powered interviews are no longer experimental. Gartner's 2025 HR Technology Survey found that 42% of enterprise companies now use some form of AI-assisted first-round screening — up from 14% in 2022. But "AI interview" has become a marketing category that obscures three fundamentally different technologies with different capabilities, candidate experiences, and evaluation reliability profiles.

This guide separates the formats, explains how each works technically, compares their evaluation quality against the evidence base, and provides a decision framework for choosing the right format for your hiring context.

The Three AI Interview Formats: What They Actually Do

Format 1: Real-Time Voice AI Interview

The candidate has a live spoken conversation with an AI. The AI listens, processes what was said, and generates the next question in real time based on the response. The interview adapts: a candidate who claims expertise in distributed systems will receive follow-up questions that probe that claim; a candidate who gives a shallow answer will receive clarifying prompts.

This format mirrors a human first-round interview in structure: it asks about background, probes specific competencies, explores the candidate's actual depth beyond their resume, and produces a transcript and evaluation report.

Format 2: Async Video Interview

The candidate receives a set of pre-recorded or text questions and records video responses in their own time. The AI analyzes the recordings (vocal tone, word choice, facial expressions, and answer content) and produces a score or ranking.

The critical distinction: async video AI cannot adapt. The questions are fixed. The AI cannot follow up on an interesting answer or probe a suspicious claim. It applies a scoring model to fixed inputs.

Format 3: Chatbot / Text Screening

A text-based Q&A: the candidate answers questions by typing. The AI screens responses for keyword presence, answer length, and basic coherence. This is the lowest-fidelity format: it functions as an augmented application form rather than an interview.

| Format | Adaptive | Real-Time | Evaluation Depth | Candidate Drop-Off |
| --- | --- | --- | --- | --- |
| Real-time voice AI | Yes | Yes | High | ~19% |
| Async video | No | No | Medium | ~38% |
| Chatbot screening | Partially | No | Low | ~12% |

How Real-Time Voice AI Interviews Work

Understanding the technical architecture helps evaluate vendor claims:

Step 1: Job configuration. The hiring team inputs the job description, required competencies, and seniority level. The AI generates an interview structure — opening questions, competency-specific probes, depth questions — calibrated to the role.

Step 2: Real-time transcription. The candidate speaks; the AI transcribes using a speech-to-text model (typically Deepgram, Whisper, or equivalent). Latency matters here: models with >800ms transcription latency produce unnatural pauses that degrade conversation quality.

Step 3: Response analysis. The transcription is analyzed for completeness, specificity, and relevance to the competency being probed. This determines what the next question should be — a follow-up probe, a clarifying question, or an advancement to the next topic.

Step 4: Response generation. The AI generates the next question using an LLM (GPT-4, Claude, or equivalent), ensuring it flows naturally from what was just said.

Step 5: Text-to-speech output. The generated question is converted to speech and played to the candidate. Voice quality matters for candidate experience — robotic-sounding TTS increases perceived AI distance and reduces candid responses.

Step 6: Evaluation report. After the interview, the AI produces a structured report: competency scores, evidence (direct quotes), a recommendation, and the full transcript. For behavioral questions in engineering interviews, the report gives a STAR-format (Situation, Task, Action, Result) evidence trail that human reviewers can audit.
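The six steps above can be sketched as a single turn loop. This is a minimal illustration under stated assumptions, not any vendor's implementation: `listen`, `generate_question`, and `speak` are hypothetical stand-ins for the speech-to-text, LLM, and text-to-speech services, and the word-count rule in `analyze` is a placeholder for a real response-analysis model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Report:
    """Step 6: structured output that a human reviewer can audit."""
    transcript: List[Tuple[str, str]] = field(default_factory=list)  # (question, answer)
    evidence: Dict[str, List[str]] = field(default_factory=dict)     # competency -> quotes


def analyze(answer: str, min_words: int = 12) -> str:
    """Step 3 (placeholder heuristic): short answers trigger a follow-up probe."""
    return "advance" if len(answer.split()) >= min_words else "probe"


def run_interview(
    competencies: List[str],                             # Step 1: from the job configuration
    listen: Callable[[], str],                           # Step 2: hypothetical STT wrapper
    generate_question: Callable[[str, str, str], str],   # Step 4: hypothetical LLM wrapper
    speak: Callable[[str], None],                        # Step 5: hypothetical TTS wrapper
    max_probes: int = 2,
) -> Report:
    report = Report()
    for competency in competencies:
        question = generate_question(competency, "open", "")
        probes = 0
        while True:
            speak(question)
            answer = listen()
            report.transcript.append((question, answer))
            report.evidence.setdefault(competency, []).append(answer)
            # Move on when the answer has enough depth or the probe budget is spent.
            if analyze(answer) == "advance" or probes >= max_probes:
                break
            probes += 1
            question = generate_question(competency, "probe", answer)
    return report
```

The `max_probes` cap is a design assumption: without it, a candidate who keeps giving short answers would hold the loop on one competency indefinitely.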

Evaluation Quality Comparison

The predictive validity of different interview formats is well-studied. The comparison below uses the Schmidt & Hunter (1998) meta-analysis framework, as updated in Sackett et al. (2022):

| Format | Predictive Validity | Inter-Rater Reliability | Bias Risk |
| --- | --- | --- | --- |
| Structured human interview | 0.51 | 0.67 | Moderate |
| Real-time voice AI (adaptive) | 0.45-0.53* | 1.0 (deterministic) | Auditable |
| Async video AI | 0.28-0.35 | 0.60-0.72 | High (facial analysis) |
| Chatbot screening | 0.18-0.24 | 0.80 | Low |
| Unstructured human interview | 0.38 | 0.37 | High |

*Range reflects variance across AI interview vendors and role types. Validated figures are drawn from Paradox/Olivia, HireVue academic studies, and Apriora internal research.

Key insight: Real-time voice AI achieves validity comparable to structured human interviews and substantially higher than async video. The inter-rater reliability of 1.0 (deterministic output for the same input) removes interviewer-to-interviewer variance: every candidate is evaluated against exactly the same framework.

For system design interview evaluation, the adaptive format allows depth probing that matches what a senior engineer would cover in a structured round — at a fraction of the cost.

For interview scorecards and evaluation reports, AI-generated output addresses the most common scorecard failure mode: incomplete evidence. An AI interview produces a full transcript with each statement attributed to a competency, so nothing depends on interviewer memory.

The major critique of AI interview validity is construct coverage: does the AI evaluate what matters for the role, or does it evaluate what it was trained to measure? This depends on configuration quality and ongoing validation against actual job performance data. Vendors who cannot show criterion validity studies should be treated skeptically.
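Criterion validity is usually reported as a correlation between interview scores and a later measure of job performance. The sketch below computes a plain Pearson correlation; the function name and the sample numbers are illustrative assumptions, not data from any study cited here.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between interview scores (x) and performance ratings (y)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / sqrt(var_x * var_y)


# Made-up numbers for illustration only: AI interview scores for eight hires
# and their manager performance ratings twelve months later.
interview_scores = [62, 71, 55, 80, 68, 90, 47, 75]
performance_ratings = [3.1, 3.4, 3.0, 3.9, 2.8, 4.2, 3.3, 3.5]
print(round(pearson_r(interview_scores, performance_ratings), 2))
```

Published meta-analytic figures are typically corrected for effects such as range restriction, so a raw correlation from a small sample of hires is not directly comparable to the values in the table above.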

Candidate Experience: What the Research Says

Candidate acceptance of AI interviews is context-dependent. A 2024 Phenom survey of 5,800 candidates found:

  • 61% found AI interviews acceptable when the process was transparent about AI involvement
  • 38% dropped out of async video screening before completing (vs 19% for real-time voice AI)
  • 73% preferred real-time voice AI over async video because it "felt like a real conversation"
  • 42% had privacy concerns about async video analysis, particularly facial expression AI
  • 54% wanted to know how the AI evaluation would be used and whether a human would review it

The candidate drop-off comparison is significant: async video's 38% drop-off rate is a direct cost, namely lost candidates who may have been qualified but found the format objectionable. For technical roles where top candidates have multiple options, format friction directly affects pipeline conversion.
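To make the cost concrete, the arithmetic can be worked through for a hypothetical pipeline. The 1,000-candidate pool is an assumption chosen for illustration; the drop-off rates are the survey figures quoted above.

```python
def completed_interviews(invited: int, drop_off_rate: float) -> int:
    """Number of invited candidates expected to finish the screen."""
    if not 0 <= drop_off_rate <= 1:
        raise ValueError("drop_off_rate must be between 0 and 1")
    return round(invited * (1 - drop_off_rate))


invited = 1000  # hypothetical pipeline size
async_done = completed_interviews(invited, 0.38)
voice_done = completed_interviews(invited, 0.19)
print(async_done, voice_done, voice_done - async_done)  # prints: 620 810 190
```

Under these assumptions, the format choice alone accounts for 190 additional completed screens per 1,000 invitations.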

Candidate perception also varies by demographic. A 2023 HireVue independent audit found video analysis tools (which analyze facial expressions and vocal tone) produced significantly different scores for candidates from different national backgrounds — a finding that has produced regulatory scrutiny in multiple jurisdictions.

When to Use Each Format

| Hiring Context | Recommended Format | Reason |
| --- | --- | --- |
| Senior/staff engineer (low volume, high stakes) | Real-time voice AI or structured human | Depth and adaptability matter; async video insufficient |
| High-volume lateral hiring (50+ roles/quarter) | Real-time voice AI | Eliminates human first-round bottleneck while maintaining depth |
| Graduate/entry-level (very high volume) | Async video or chatbot for qualification, voice AI for qualified pool | Cost optimization at scale |
| Domain expert roles (finance, legal, healthcare) | Real-time voice AI with domain-specific configuration | Requires adaptive questioning; chatbot/video cannot probe domain depth |
| Culture/values alignment check | Structured human or voice AI with behavioral competency focus | Specific behavioral competency evidence needed |
| Initial qualification screen (salary range, availability, work authorization) | Chatbot | Low complexity; over-investing in voice AI here is wasteful |
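The framework above can also be written as a small rule function, which is one way to encode it in an internal hiring tool. This is a simplified sketch: the flag names and the order in which the rules are checked are assumptions, and the culture/values row is left out for brevity.

```python
from enum import Enum
from typing import List


class Format(Enum):
    VOICE_AI = "real-time voice AI"
    STRUCTURED_HUMAN = "structured human interview"
    ASYNC_VIDEO = "async video"
    CHATBOT = "chatbot"


def recommend_formats(
    qualification_only: bool = False,
    needs_domain_depth: bool = False,
    senior_high_stakes: bool = False,
    entry_level_high_volume: bool = False,
) -> List[Format]:
    """Return acceptable formats for a hiring context (rule order is an assumption)."""
    if qualification_only:
        # Salary range, availability, work authorization: low complexity.
        return [Format.CHATBOT]
    if needs_domain_depth:
        # Finance, legal, healthcare: only an adaptive format can probe depth.
        return [Format.VOICE_AI]
    if senior_high_stakes:
        return [Format.VOICE_AI, Format.STRUCTURED_HUMAN]
    if entry_level_high_volume:
        # Cheap qualification first, then voice AI for the qualified pool.
        return [Format.CHATBOT, Format.ASYNC_VIDEO, Format.VOICE_AI]
    # Default: high-volume lateral hiring.
    return [Format.VOICE_AI]


print([f.value for f in recommend_formats(senior_high_stakes=True)])
```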

How to Evaluate AI Interview Vendors

Four questions that cut through marketing claims:

1. Is the interview adaptive? Ask for a live demonstration in which the same question is answered twice, once superficially and once in depth. Does the follow-up question change based on the depth of the response? If not, it is async video rebranded as voice AI.

2. Can you review a sample transcript and report? The transcript shows you exactly what the AI asked and what was captured. The report shows how evaluation scores map to specific candidate statements. If a vendor cannot show this, they cannot demonstrate evaluation reliability.

3. What is the criterion validity for the role type? Criterion (predictive) validity, meaning whether the AI's score predicts actual job performance, is the only metric that matters for evaluation quality. Vendors without criterion validity data are selling claims they have not demonstrated.

4. What is the regulatory compliance posture? For US hiring: does the vendor support NYC LL144 bias audit compliance? For EU hiring: is the system documented under the EU AI Act high-risk framework? Absence of compliance documentation is a material risk for enterprise buyers.
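For the compliance question, it helps to know what a bias audit typically computes. A central quantity is the impact ratio: each group's selection rate divided by the selection rate of the most-selected group. The sketch below is illustrative only. The group labels and counts are invented, and the 0.8 cutoff is the conventional four-fifths rule of thumb from US employment guidance, used here as an assumed screening threshold and not as a requirement stated in this article.

```python
from typing import Dict, Tuple


def impact_ratios(outcomes: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each group to its selection rate relative to the most-selected group.

    `outcomes` maps group -> (number advanced, number screened).
    """
    rates = {
        group: advanced / screened
        for group, (advanced, screened) in outcomes.items()
        if screened > 0
    }
    if not rates:
        raise ValueError("no group has any screened candidates")
    best = max(rates.values())
    if best == 0:
        raise ValueError("no candidates were advanced in any group")
    return {group: rate / best for group, rate in rates.items()}


# Invented counts for illustration only.
sample = {"group_a": (40, 100), "group_b": (24, 100)}
ratios = impact_ratios(sample)
flagged = [g for g, r in ratios.items() if r < 0.8]  # assumed four-fifths threshold
print(ratios, flagged)
```

A vendor with real audit documentation should be able to show figures of this kind, broken down by the categories the applicable regulation specifies.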

Red flags: Vendors who lead with facial expression analysis (banned in several jurisdictions); vendors who cannot produce sample reports; vendors who claim 'no bias' without an independent audit; and vendors who reuse the same question bank for all roles without configuration.

How Nextmantra AI Approaches This

Nextmantra AI is a real-time voice AI interview platform purpose-built for first-round replacement. The interview format is a 45-minute adaptive conversation — not pre-recorded questions, not a chatbot, not async video. The AI reads the job description, generates a competency-specific question structure, and adapts every follow-up based on what the candidate says.

For engineering roles, this means a candidate who claims "5 years of React" will face follow-up questions on hooks vs class components, state management trade-offs, and performance optimization patterns — until the AI reaches the actual boundary of their knowledge. For sales roles, a candidate who claims "consultative selling" experience will be probed on specific objection-handling techniques, pipeline management, and deal qualification processes. The AI does not accept surface-level answers as evidence — it probes until it finds depth or absence.

Every interview produces a structured evaluation report with competency scores, direct-quoted evidence, and a recommendation: the equivalent of a completed interview scorecard. Human reviewers receive pre-scored candidates with evidence, not a resume and a gut feeling.

Frequently Asked Questions

What is an AI-powered interview?

An AI-powered interview is a candidate evaluation conducted by artificial intelligence rather than a human. The three formats are real-time voice AI (adaptive live conversation), async video (pre-set questions recorded and analyzed), and chatbot screening (text-based Q&A). They differ significantly in depth, adaptability, and candidate experience.

How accurate are AI interviews compared to human interviews?

Real-time voice AI interviews with adaptive questioning show predictive validity of 0.45-0.53, comparable to structured human interviews (0.51). Async video interviews show lower validity (0.28-0.35). Chatbot screening is lowest (0.18-0.24). The key variable is whether the format uses structured evaluation criteria — not AI vs human.

Do candidates accept AI interviews?

A 2024 Phenom survey of 5,800 candidates found that 61% considered AI interviews acceptable when the process was transparent about AI involvement. Async video had a 38% drop-off rate before completion, versus 19% for real-time voice AI. Candidates preferred voice AI because it felt like a real conversation.

Can AI interviews detect lying or resume fraud?

Real-time voice AI with adaptive questioning exposes surface-level candidates not through lie detection but through depth probing. A candidate who claims a skill but cannot demonstrate it is exposed by the AI's ability to keep generating adaptive follow-up questions without fatigue. Async video cannot do this; it applies fixed analysis to fixed answers.

What are the legal considerations for AI interviews?

Key regulations: NYC Local Law 144 requires annual bias audits for automated hiring tools; Illinois and Maryland require notice to candidates; the EU AI Act classifies hiring AI as high-risk. At minimum: disclose AI involvement, document evaluation criteria, retain records, and ensure human review of AI recommendations before final decisions.

What is the difference between AI video interviews and AI voice interviews?

AI video interviews are asynchronous — candidates record responses to fixed questions, which are analyzed afterward. AI voice interviews are real-time — the AI listens and generates the next question based on what the candidate just said. Voice AI can adapt; video AI cannot. This is the defining quality differentiator.

How long should an AI-powered first-round interview be?

45 to 55 minutes is optimal for a senior-level first-round AI voice interview. For phone screen replacement at junior levels, 20-25 minutes covers minimum qualifications and communication. Shorter than 20 minutes produces insufficient data; longer than 60 minutes reduces candidate completion rates.

Can AI interviews handle technical roles fairly?

Yes, when configured with role-specific question banks and evaluation criteria. An AI interview for a senior backend engineer should probe distributed systems, failure modes, and architecture trade-offs. The configuration input is the job description and required competency framework. AI interviews using the same question bank for all roles produce generic evaluation.

Conclusion

AI-powered interviews are a category, not a product — and the category spans formats with dramatically different quality profiles. Real-time voice AI with adaptive questioning produces evaluation quality comparable to structured human interviews at a fraction of the cost. Async video and chatbot screening produce lower validity and higher candidate friction for senior roles. The decision framework is straightforward: the higher the stakes and the more domain depth required, the more adaptive the format must be.

See how a 45-minute adaptive AI voice interview compares to your current first round: [Nextmantra AI Platform](https://nextmantra.ai/platform)

Sources: Gartner HR Technology Survey 2025; Schmidt & Hunter (1998), Psychological Bulletin; Sackett et al. (2022), Journal of Applied Psychology; Phenom Candidate Experience Survey 2024; HireVue Independent Bias Audit 2023 (Hanold Associates); Paradox AI Validity Study 2024.