Cache me if you Can: an Online Cost-aware Teacher-Student Framework to Reduce the Calls to Large Language Models (EMNLP 2023) We propose a framework for reducing calls to LLMs by caching previous LLM responses and using them to train an inexpensive local model. We measure the tradeoff between performance and cost. Experimental results show that significant cost savings can be obtained with only slightly lower performance.
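A minimal sketch of the cache-and-distill loop this abstract describes, assuming a classification setting. This is an illustration, not the paper's released code: the class name, the student interface (`fit` / `predict_with_confidence`), and the threshold and retraining schedule are all hypothetical placeholders.

```python
class CachedTeacherStudent:
    """Route queries to a cheap local student when possible; call the LLM otherwise."""

    def __init__(self, student, call_llm, confidence_threshold=0.9, retrain_every=100):
        self.student = student              # cheap local model; assumed interface, see predict()
        self.call_llm = call_llm            # expensive teacher: text -> label
        self.threshold = confidence_threshold
        self.retrain_every = retrain_every
        self.cache = {}                     # text -> cached teacher label
        self.train_texts, self.train_labels = [], []
        self.student_ready = False          # becomes True after the first retraining

    def predict(self, text):
        # 1. Exact-match cache hit: answer at zero cost.
        if text in self.cache:
            return self.cache[text]
        # 2. If a trained student is confident enough, trust it and skip the LLM.
        if self.student_ready:
            label, conf = self.student.predict_with_confidence(text)  # assumed method
            if conf >= self.threshold:
                return label
        # 3. Otherwise pay for a teacher call, cache it, and keep it as training data.
        label = self.call_llm(text)
        self.cache[text] = label
        self.train_texts.append(text)
        self.train_labels.append(label)
        # Periodically retrain the student on all teacher labels collected so far.
        if len(self.train_texts) % self.retrain_every == 0:
            self.student.fit(self.train_texts, self.train_labels)
            self.student_ready = True
        return label
```

The cost-performance tradeoff the paper measures corresponds here to tuning `confidence_threshold`: a higher threshold sends more queries to the teacher (higher cost, higher accuracy), a lower one trusts the student more often.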
Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking (ACM ICAIF 2023) Standard full-data classifiers in NLP demand thousands of labeled examples, which is impractical in data-limited domains. Few-shot methods offer an alternative, using contrastive learning techniques that can be effective with as few as 20 examples per class.
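A sketch of one well-known contrastive few-shot method of this kind, SetFit (Tunstall et al., 2022), which fine-tunes a sentence-embedding model on generated pairs before fitting a classification head. This assumes the pre-1.0 `setfit` API; the training examples, labels, and hyperparameters below are illustrative placeholders, not necessarily the paper's exact setup.

```python
from datasets import Dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer

# A handful of labeled banking queries per class (0 = card issue, 1 = loan query).
train_ds = Dataset.from_dict({
    "text": ["my card was declined at checkout", "how do I apply for a mortgage"],
    "label": [0, 1],
})

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_ds,
    loss_class=CosineSimilarityLoss,  # contrastive objective over sentence pairs
    num_iterations=20,                # contrastive pairs generated per example
)
trainer.train()

print(model.predict(["why was my transfer blocked?"]))
```

Because the contrastive step multiplies each labeled example into many training pairs, this style of method can remain competitive with roughly 20 examples per class, which is what makes it attractive in data-limited domains like banking.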