February 6, 2026
Reliable AI Assistants: The EVAL-IA Case Study
Lessons for Italian Companies
When InfoCamere launched its first generative AI virtual assistants for Italian Chambers of Commerce, initial results were disappointing: only 50% of responses were accurate. One year later, the EVAL-IA project achieved 100% accuracy through over 60,000 automated tests. The most relevant insight? 80% of errors stemmed not from AI models, but from underlying data quality.
This counterintuitive discovery offers Italian businesses a crucial lesson in the generative AI era, where the national market reaches €1.8 billion and grows 50% annually: virtual assistant success isn’t bought with cutting-edge technology—it’s built with well-governed data.
When data matters more than algorithms
InfoCamere, the consortium of Italy’s 88 Chambers of Commerce, faced a concrete challenge: ensuring virtual assistant performance that is measurable, governable, and reliable over time. As highlighted in Agenda Digitale’s analysis, the core issue isn’t algorithms but information structure.
Imagine an entrepreneur accessing their Chamber of Commerce portal in search of funding opportunities. When they ask the virtual assistant about “the right grant for my business,” response quality hinges entirely on how the underlying data is organized. If grant data is fragmented, outdated, or inconsistently described, the AI is powerless: not because of model limitations, but because it is forced to interpret ambiguous material. The solution? Prepare “AI-readable” content: clearly structured, coherent information that minimizes interpretation errors.
Many attribute virtual assistant errors to the generative models themselves. A year ago, with less mature technology, this held true. Today, InfoCamere’s analysis reveals a different reality: up to 80% of inaccuracies stem from poorly structured or incomplete source data; they are direct consequences of data fragility, not model inventions. The EVAL-IA methodology rests on three pillars: data readiness, quality engineering with an agent validator, and continuous governance. The system analyzed institutional portals to map information gaps, delivered completeness and consistency metrics, then rewrote and reorganized content. The result: a 130% increase in monitoring productivity.
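To make the “completeness metric” idea concrete, here is a minimal sketch of how such a data-readiness check might work. The field names and threshold are hypothetical illustrations, not InfoCamere’s actual schema or tooling:

```python
from dataclasses import dataclass

# Hypothetical required fields for a grant record; a real audit
# would derive these from the portal's content model.
REQUIRED_FIELDS = ("title", "deadline", "eligibility", "amount")

@dataclass
class ContentRecord:
    fields: dict

def completeness(record: ContentRecord) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.fields.get(f))
    return filled / len(REQUIRED_FIELDS)

def audit(records: list[ContentRecord], threshold: float = 0.75) -> list[int]:
    """Return the indices of records falling below the completeness threshold."""
    return [i for i, r in enumerate(records) if completeness(r) < threshold]

records = [
    ContentRecord({"title": "Grant A", "deadline": "2026-03-31",
                   "eligibility": "SME", "amount": "50k"}),
    ContentRecord({"title": "Grant B", "deadline": "", "eligibility": "SME"}),
]
print(audit(records))  # → [1]
```

Records flagged by the audit are the ones an editorial team would rewrite or complete before the assistant is allowed to draw on them.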
This challenges conventional practice. Most companies start with advanced Large Language Models, only to find that accurate responses require more. Retrieval-Augmented Generation studies show precision dropping by up to 30% on noisy datasets. InfoCamere proved the solution lies upstream in data quality, not downstream in algorithmic optimization.
The agent validator: AI checking AI
InfoCamere developed an automated agent validator: a secondary AI system that verifies the primary assistant’s responses against expected targets. The approach is far from isolated: the World Quality Report 2024 shows 68% of organizations use generative AI to optimize testing and automation.
Running more than 60,000 tests in 2025 would have been unthinkable manually. This automation shifted quality from reactive (fixing user-reported errors) to proactive: preventing issues before users are exposed to them. An insurance company could replicate this by generating thousands of automated queries, comparing responses against official policies, and identifying discrepancies before deployment.
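The validator pattern can be sketched in a few lines. In this illustration a coarse token-overlap check stands in for the secondary AI model, and the queries, expected answers, and toy assistant are all hypothetical:

```python
def normalize(text: str) -> set[str]:
    """Lowercase and tokenize a response for a coarse overlap comparison."""
    return set(text.lower().split())

def validate(response: str, expected: str, min_overlap: float = 0.6) -> bool:
    """Pass a response if it covers enough of the expected answer's terms."""
    exp = normalize(expected)
    if not exp:
        return True
    return len(normalize(response) & exp) / len(exp) >= min_overlap

# Hypothetical (query, expected answer) pairs; a real suite would hold thousands.
test_cases = [
    ("opening hours", "open monday to friday 9 to 17"),
    ("registration fee", "the annual fee is 120 euro"),
]

def run_suite(assistant, cases):
    """Run the assistant over every case and collect failures for editor review."""
    failures = []
    for query, expected in cases:
        answer = assistant(query)
        if not validate(answer, expected):
            failures.append((query, answer))
    return failures

def toy_assistant(query: str) -> str:
    """Stand-in for the production assistant under test."""
    canned = {
        "opening hours": "we are open monday to friday 9 to 17",
        "registration fee": "contact support",
    }
    return canned[query]

failures = run_suite(toy_assistant, test_cases)
print(failures)  # → [('registration fee', 'contact support')]
```

In production the `validate` step would itself be an LLM judging semantic equivalence, but the loop structure, and the fact that failures go to humans rather than users, is the same.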
The framework balances automation and human oversight: the agent validator runs tests, but final content passes editor review. This hybrid of automated efficiency and human accountability will become standard for reliable AI systems.
Governance as strategic asset
EVAL-IA transforms AI response quality from “random effect” to “governed outcome”. InfoCamere’s approach goes beyond response verification: it builds scalable, reusable information bases across multiple models and services. Already applied across Italy’s 88 Chambers of Commerce, the methodology extends to Regions, Municipalities, and corporate contexts.
For large enterprises with distributed branches, this model delivers significant economies of scale: the initial investment in the governance framework can be replicated across sites, customizing content while maintaining uniform quality standards. For Italian SMEs, which represent 18% of the national AI market, scalability means extending virtual assistants from customer service to HR, sales, or operations without starting over. The methodology leverages scraping and asynchronous analysis techniques that process information sources in parallel.
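The parallel-analysis idea can be illustrated with Python’s `asyncio`. The simulated fetch below stands in for a real HTTP client (such as `aiohttp`), and the URLs and the length-based metric are purely illustrative:

```python
import asyncio

async def fetch_page(url: str) -> str:
    """Stand-in for an HTTP fetch; a real crawler would call an async HTTP client here."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>content of {url}</html>"

async def analyze(url: str) -> tuple[str, int]:
    """Fetch one source and return a simple metric (here, raw content length)."""
    page = await fetch_page(url)
    return url, len(page)

async def crawl(urls: list[str]) -> list[tuple[str, int]]:
    """Process all sources concurrently instead of one by one."""
    return await asyncio.gather(*(analyze(u) for u in urls))

results = asyncio.run(crawl([
    "https://example.org/a",
    "https://example.org/b",
]))
print(results)
```

Because the fetches overlap in time, analyzing hundreds of portal pages takes roughly as long as the slowest single fetch rather than the sum of them all, which is what makes periodic re-audits of large content bases practical.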
Yet data reveals a significant gap: only 15% of medium enterprises and 7% of small ones have activated AI projects. This disparity offers concrete opportunity for companies adopting structured methodologies like EVAL-IA, turning AI governance into a competitive differentiator in customer service.
Regulatory compliance: from obligation to differentiator
The AI Act, effective February 2, 2025, classifies virtual assistants as “limited risk” systems – encompassing 70% of AI systems used by Italian digital companies. The regulatory obligation is explicit: immediately disclose the system’s artificial nature. Violations carry fines up to 3% of annual turnover.
InfoCamere transformed this regulatory constraint into strategic advantage. A virtual assistant that openly declares its AI nature and delivers 100% accurate responses builds greater trust than ambiguous systems. Italy, the EU’s first member state with national AI legislation (Law 132/2025, effective October 10, 2025), simultaneously established a €1 billion AI development fund.
Companies must update Organizational Models 231, define internal governance policies, and implement AI procurement toolkits. Those integrating transparency and accountability into system design will build lasting competitive advantage. In B2B markets where procurement rigorously assesses risk and regulatory compliance, a certified compliant virtual assistant with verifiable accuracy becomes a tangible commercial differentiator.
Costly mistakes to avoid
The technology-first approach remains the most common error: companies select advanced Large Language Models without evaluating their data. InfoCamere proves 80% of errors stem from ambiguous data. The solution lies in source information quality, not model power.
Equally critical is underestimating the work of continuous governance. EVAL-IA required systematic portal analysis, content rewriting, precise metrics definition, and constant monitoring. These aren’t one-off tasks but an ongoing commitment. Without dedicated resources, virtual assistant accuracy degrades progressively: data ages, policies change, new information is never integrated. The result is a system that loses reliability over time.
Deploying without automated validation poses another frequent risk. Without an agent validator continuously testing responses, errors surface through end users, causing immediate reputational damage and eroding trust in the system. Siloed management exacerbates these issues: projects run separately across IT, business units, legal, and compliance fail due to misalignment between data owners, those who define functional requirements, and regulatory guarantors.
With the AI Act imposing fines up to 3% of turnover, integrating compliance by design from project outset is no longer optional. It’s a strategic choice determining entire initiative success or failure.
Strategic Adoption Principles
The EVAL-IA project crystallizes six immediately applicable strategic principles.
- The first concerns investment priority: favor data quality over computational power, as even advanced models can’t compensate for poorly prepared information.
- The second involves building automated validation systems that enable test volumes impossible to reach manually, while maintaining human oversight of critical decisions.
- The third transforms quality control from an operational cost into a differentiator: verifiable accuracy builds tangible commercial trust.
- The fourth suggests leveraging AI Act requirements to build client and partner trust beyond mere sanction avoidance: transparency paired with accurate responses becomes a strategic asset.
- The fifth advises investing in replicable frameworks that scale from pilot implementation to the full organization without a restart.
- The sixth proposes using AI itself to verify AI, freeing human resources for higher-value strategic governance activities.
From case study to action
InfoCamere’s EVAL-IA project proves the leap from 50% to 100% accuracy isn’t reserved for organizations with unlimited budgets; it’s the product of precise methodological choices. With Italy’s AI market reaching €1.8 billion and growing 50% annually, and roughly 44,000 open positions requiring AI skills (+93% year-over-year), opportunities match responsibilities.
The AI Act took effect on February 2, 2025, and Italian legislation on October 10, 2025, with fines reaching 3% of turnover. Italian companies face a choice: deploy half-effective virtual assistants, or adopt EVAL-IA principles that prioritize data quality over computational power, continuous governance over sporadic checks, and transparency over opacity. The €1.8 billion market will reward those who build ecosystems where AI realizes its potential through structured data, automated validation, and human oversight: a model of Italian excellence that InfoCamere has proved possible.
Sources:
- Agenda Digitale, “Trusting Virtual Assistants: The EVAL-IA Case by InfoCamere”, https://www.agendadigitale.eu/documenti/affidarsi-agli-assistenti-virtuali-il-caso-eval-ia-di-infocamere/
- InfoCamere, “EVAL-IA: InfoCamere’s Project for Reliable Artificial Intelligence”, https://www.infocamere.it/eval_ia
- Osservatorio Artificial Intelligence, Politecnico di Milano, “Artificial Intelligence in Italy: 50% Market Growth”, https://www.osservatori.net/comunicato/artificial-intelligence/intelligenza-artificiale-italia/

Marta Magnini
Digital Marketing & Communication Assistant at Aidia, graduated in Communication Sciences and passionate about performing arts.
At Aidia, we develop AI-based software solutions, NLP solutions, Big Data Analytics, and Data Science. Innovative solutions to optimize processes and streamline workflows. To learn more, contact us or send an email to info@aidia.it.



