Self-Critique

Advanced Techniques
Last updated: 2025-01-15
Also known as: self-evaluation, self-verification

What is Self-Critique?


Self-critique is the ability of an AI agent to evaluate and improve its own outputs through iterative refinement. Rather than accepting the first generated response, agents with self-critique abilities can review their work, identify problems or weaknesses, and generate improved versions. This self-evaluation and revision process can significantly improve output quality, particularly for complex tasks where initial attempts may be flawed or incomplete.


The process typically involves generating an initial response, prompting the model to critique that response by identifying errors, inconsistencies, or areas for improvement, and using the critique to generate a revised response. This can be repeated for multiple rounds of refinement. The critique step might use specific criteria (factual accuracy, completeness, coherence, etc.) and can be enhanced by retrieving information to verify claims or by using specialized evaluation models to assess quality.
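The generate-critique-revise loop described above can be sketched as a small control function. This is a minimal illustration, not a definitive implementation: `generate`, `critique`, and `revise` are hypothetical stand-ins for model calls, and the toy versions below exist only to show the control flow.

```python
def refine(generate, critique, revise, prompt, max_rounds=3):
    """Generate an initial response, then iteratively critique and revise it.

    `critique` returns None when it finds no further problems, which
    ends the loop early; otherwise its feedback drives the next revision.
    """
    response = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(prompt, response)
        if feedback is None:  # critic is satisfied: stop refining
            break
        response = revise(prompt, response, feedback)
    return response


# Toy stand-ins for the three model calls (hypothetical; a real agent
# would prompt an LLM for each role).
def generate(prompt):
    return "draft"

def critique(prompt, response):
    # Flag the response until it contains the word "final".
    return None if "final" in response else "add the word 'final'"

def revise(prompt, response, feedback):
    return response + " final"

result = refine(generate, critique, revise, "write a summary")
# result == "draft final": one critique round fired, then the critic passed
```

In practice each round costs an extra generation, so `max_rounds` caps the trade-off between quality and latency that the surrounding text mentions.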


Self-critique has proven valuable for improving agent reliability and output quality without human intervention. It's particularly effective for tasks like writing (where iterative refinement improves quality), reasoning (where verifying logical steps catches errors), and code generation (where testing can reveal bugs). Systems like Self-RAG incorporate critique into the retrieval-generation loop, evaluating whether retrieved information is relevant and whether generated content is supported by sources. The main challenges are increased computational cost from multiple generation rounds and the risk of the model being unable to identify its own errors, particularly systematic ones.
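A Self-RAG-style loop applies critique at two points: filtering retrieved passages for relevance, then checking that the generated answer is grounded in the kept passages. The sketch below uses hypothetical string-matching stand-ins for the retriever, generator, and critic judgments; a real system would use model calls for each.

```python
def self_rag_answer(query, retrieve, generate, is_relevant, is_supported):
    """Retrieve, filter irrelevant passages, generate, and verify grounding."""
    # Critique step 1: keep only passages the critic judges relevant.
    passages = [p for p in retrieve(query) if is_relevant(query, p)]
    answer = generate(query, passages)
    # Critique step 2: check the answer is supported by some kept passage.
    grounded = bool(passages) and any(is_supported(answer, p) for p in passages)
    return answer, grounded


# Toy stand-ins (hypothetical, for illustration only).
def retrieve(query):
    return ["Paris is the capital of France.", "Bananas are yellow."]

def is_relevant(query, passage):
    # Crude relevance check: any query word appears in the passage.
    return any(word in passage.lower() for word in query.lower().split())

def generate(query, passages):
    return passages[0] if passages else "I don't know."

def is_supported(answer, passage):
    return answer in passage

answer, grounded = self_rag_answer(
    "capital of France", retrieve, generate, is_relevant, is_supported
)
# The irrelevant banana passage is filtered out, and the answer is grounded.
```

An ungrounded result (`grounded == False`) would typically trigger re-retrieval or regeneration rather than being returned to the user.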


Related Terms