How to evaluate RAG with classification metrics (GRACE) In this blog post, we introduce GRACE, which stands for "Grounded Retrieval-Augmented Citation Evaluation", a technique for evaluating LLMs effectively and affordably with simple classification metrics (such as accuracy), just by comparing the citations the LLM selected for its generated answer against the annotated citation IDs. In order to enable GRACE
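To make the citation comparison concrete, here is a minimal sketch of one way such a score could be computed. It is not the post's implementation: the function name `citation_accuracy`, the exact-match rule (an answer counts as correct only if its set of cited IDs equals the annotated set), and the sample IDs are all assumptions made for illustration.

```python
from typing import Iterable, Sequence


def citation_accuracy(
    predicted: Sequence[Iterable[str]],
    annotated: Sequence[Iterable[str]],
) -> float:
    """Fraction of answers whose cited IDs exactly match the annotated IDs.

    Hypothetical helper: exact set match is an assumed scoring rule,
    not necessarily the one GRACE uses.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        return 0.0
    # Order of citations is ignored; only the set of IDs matters.
    hits = sum(set(p) == set(a) for p, a in zip(predicted, annotated))
    return hits / len(predicted)


# Made-up example: three answers, each with LLM-selected citation IDs.
predicted = [["doc1"], ["doc2", "doc3"], ["doc4"]]
annotated = [["doc1"], ["doc3", "doc2"], ["doc5"]]
score = citation_accuracy(predicted, annotated)
```

In this made-up example the first two answers match their annotations and the third does not, so the score is two out of three. A stricter or looser rule (for example, per-citation precision and recall) would plug into the same comparison.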
Papers Cache me if you Can: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to Large Language Models (EMNLP 2023) We propose a framework for reducing calls to LLMs by caching previous LLM responses and using them to train a local inexpensive model. We measure the tradeoff between performance and cost. Experimental results show that significant cost savings can be obtained with only slightly lower performance.
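The caching idea can be illustrated with a deliberately simplified sketch. It is not the paper's framework: instead of a trained student model it swaps in a token-overlap nearest-neighbour lookup over cached responses, and the class name `CachedClassifier`, the `min_overlap` threshold, and the stand-in `fake_llm` teacher are all hypothetical.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple


class CachedClassifier:
    """Answer from cached teacher responses when a close match exists.

    Assumption: Jaccard token overlap stands in for a trained student model.
    """

    def __init__(self, teacher: Callable[[str], str], min_overlap: float = 0.8):
        self.teacher = teacher          # expensive model (e.g. an LLM call)
        self.min_overlap = min_overlap  # assumed similarity threshold
        self.cache: List[Tuple[FrozenSet[str], str]] = []
        self.teacher_calls = 0

    @staticmethod
    def _tokens(text: str) -> FrozenSet[str]:
        return frozenset(text.lower().split())

    def predict(self, text: str) -> str:
        query = self._tokens(text)
        best_label: Optional[str] = None
        best_sim = 0.0
        for tokens, label in self.cache:
            union = query | tokens
            sim = len(query & tokens) / len(union) if union else 1.0
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_label is not None and best_sim >= self.min_overlap:
            return best_label           # cheap path: reuse a cached response
        label = self.teacher(text)      # costly path: call the teacher
        self.teacher_calls += 1
        self.cache.append((query, label))
        return label


def fake_llm(text: str) -> str:
    """Stand-in for a real LLM call, used only for this illustration."""
    return "greeting" if "hello" in text.lower() else "other"


clf = CachedClassifier(fake_llm)
first = clf.predict("hello there friend")    # cache is empty, teacher called
second = clf.predict("Hello there friend")   # same tokens, served from cache
third = clf.predict("what is my balance")    # no overlap, teacher called
```

Raising `min_overlap` sends more queries to the teacher (higher cost, fewer mistakes from reuse), while lowering it does the opposite, which is the kind of performance versus cost tradeoff the paper measures.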
Papers Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking (ACM ICAIF 2023) Standard Full-Data classifiers in NLP demand thousands of labeled examples, which is impractical in data-limited domains. Few-shot methods offer an alternative, utilizing contrastive learning techniques that can be effective with as few as 20 examples per class.
AI models for classifying green plastics patents Helvia's Stavros Vassos (CEO) and Odysseas Papadiamantopoulos (ML Engineer) joined the AI4EPO team to develop novel AI models for the European Patent Office CodeFest on Green Plastics and won first place.