Aggregates outputs from multiple AI models or evaluators into a single consensus response with confidence scores and justification.
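
One possible way to picture this aggregation is a weighted majority vote. The sketch below is illustrative only and is not taken from the tool itself: the names `Verdict`, `Consensus`, and `aggregate`, the per-source `weight` field, and the case-insensitive matching of answers are all assumptions. Confidence here is simply the share of total weight behind the winning answer, and the justification lists the sources that agreed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Verdict:
    source: str          # hypothetical label for a model or evaluator
    answer: str          # the output that source produced
    weight: float = 1.0  # assumed trust weight; equal weighting by default


@dataclass
class Consensus:
    answer: str
    confidence: float    # share of total weight behind the winning answer
    justification: str


def aggregate(verdicts: list[Verdict]) -> Consensus:
    """Weighted majority vote over normalized answers (a sketch, not the tool's method)."""
    if not verdicts:
        raise ValueError("need at least one verdict")

    totals: dict[str, float] = defaultdict(float)
    backers: dict[str, list[str]] = defaultdict(list)
    first_seen: dict[str, str] = {}

    for v in verdicts:
        # Assumption: answers match if equal after trimming and lowercasing.
        key = v.answer.strip().lower()
        totals[key] += v.weight
        backers[key].append(v.source)
        first_seen.setdefault(key, v.answer.strip())

    total_weight = sum(totals.values())
    if total_weight <= 0:
        raise ValueError("total weight must be positive")

    # On a tie, max() keeps the answer that appeared first.
    winner = max(totals, key=lambda k: totals[k])
    supporters = backers[winner]
    justification = (
        f"{len(supporters)} of {len(verdicts)} sources agreed "
        f"({', '.join(supporters)})"
    )
    return Consensus(first_seen[winner], totals[winner] / total_weight, justification)


if __name__ == "__main__":
    result = aggregate([
        Verdict("model_a", "Paris"),
        Verdict("model_b", "paris "),
        Verdict("judge_c", "Lyon"),
    ])
    print(result.answer, round(result.confidence, 2), result.justification)
```

Exact-match voting only suits short, discrete answers. Free-form responses would need some similarity measure or a judging step to group equivalent outputs before counting, which this sketch leaves out.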