# QLANKR Test

> QLANKR Test is an AI evaluation platform. Users submit AI agent output (chat logs, RAG Q&A pairs, tool call traces, classification results, generated content) and receive a scored report with a QI (QLANKR Intelligence) composite score from 0 to 100.

## What QLANKR Test does

QLANKR Test evaluates AI systems across multiple quality dimensions using independent AI judges. It produces a QI score, per-dimension breakdowns, identified strengths, and specific improvement recommendations. Results are presented as shareable report cards with permanent verification URLs.

## Who it is for

- Developers building AI agents, chatbots, or automated systems
- Teams evaluating RAG pipelines, tool-calling agents, or content generation
- Anyone who needs structured, repeatable AI quality assessment

## Assessment types

There are 10 assessment templates:

- Support Agent Assessment: evaluates customer service chatbots for accuracy, tone, completeness, escalation handling, and safety
- RAG Accuracy Check: tests retrieval-augmented generation for faithfulness, relevancy, hallucination resistance, and citation quality
- Tool-Use Correctness: evaluates tool selection, parameter accuracy, sequencing, and error handling
- Prompt Robustness: tests jailbreak resistance, instruction following, safety, and graceful refusal
- Content Generation Quality: evaluates factual accuracy, coherence, style, completeness, and originality
- Multi-Agent Coordination: tests delegation logic, coordination, conflict resolution, and task completion
- Classification & Extraction: evaluates label accuracy, extraction completeness, format compliance, and edge cases
- Agent Production Readiness: tests reliability, latency, error recovery, observability, and graceful degradation
- Code Generation Accuracy: evaluates functional correctness, code quality, security, documentation, and edge cases
- General Readiness Checklist: self-assessment covering error handling, fallback behavior, monitoring, security, and UX

## QI Scoring

QI (QLANKR Intelligence) is a composite score from 0 to 100. It is the average of dimension scores, each independently evaluated by an AI judge. Pro users get dual-judge scoring with agreement metrics.

Scores map to bands: Strong (90-100), Moderate (70-89), Developing (40-69), Early (0-39).

## Access and pricing

- Anonymous (no account): 1 AI-judged evaluation per day, single judge, report is auto-public. The General Readiness Checklist (self-assessed) works without limits and without login.
- Free account: 3 AI-judged evaluations per day, single judge, last 5 reports saved, 2 public reports.
- Pro ($19/month, or $15/month billed annually at $180/year): 25 evaluations per day, dual-judge scoring with agreement metrics, unlimited report history, unlimited public reports, PDF export, custom rubrics, stability checks, programmatic API access, webhooks, per-item breakdown scoring.

## Links

- Homepage: https://test.qlankr.com
- Pricing: https://test.qlankr.com/pricing
- API docs: https://test.qlankr.com/test/api
- Methodology: https://test.qlankr.com/methodology
- Guides: https://test.qlankr.com/guides
- FAQ: https://test.qlankr.com/faq
- Public reports: https://test.qlankr.com/reports
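
## Example: QI score arithmetic

The QI scoring rule described in this file (an average of dimension scores, mapped to a band) can be illustrated with a short sketch. This is not QLANKR's implementation. It assumes an unweighted arithmetic mean, no rounding, and that a fractional score belongs to the band whose lower bound it meets or exceeds. The function names `qi_score` and `qi_band` and the sample scores are hypothetical; the dimension names are the ones listed for the RAG Accuracy Check.

```python
# Band lower bounds come from the published ranges. Treating each band as
# "score >= lower bound" for fractional scores is an assumption.
BANDS = [(90, "Strong"), (70, "Moderate"), (40, "Developing"), (0, "Early")]


def qi_score(dimension_scores: dict[str, float]) -> float:
    """Return the unweighted mean of per-dimension scores (each in 0..100)."""
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    for name, score in dimension_scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"{name} score {score} is outside 0..100")
    return sum(dimension_scores.values()) / len(dimension_scores)


def qi_band(score: float) -> str:
    """Map a QI score in 0..100 to its band name."""
    if not 0 <= score <= 100:
        raise ValueError(f"score {score} is outside 0..100")
    for lower_bound, band in BANDS:
        if score >= lower_bound:
            return band
    raise AssertionError("unreachable: the last band starts at 0")


# Hypothetical per-dimension scores for a RAG Accuracy Check.
sample_scores = {
    "faithfulness": 92,
    "relevancy": 85,
    "hallucination_resistance": 78,
    "citation_quality": 65,
}

qi = qi_score(sample_scores)
print(qi, qi_band(qi))  # 80.0 Moderate
```

In this sample, one weak dimension (citation quality at 65) pulls an otherwise strong result down into the Moderate band, which is why the per-dimension breakdown is worth reading alongside the composite score.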