Bookmarks

Improve Inference Performance with LLMLingua

LLMLingua uses a compact, well-trained language model (e.g., GPT2-small, LLaMA-7B) to identify and remove non-essential tokens from prompts. This makes inference with large language models (LLMs) more efficient, achieving up to 20x compression with minimal performance loss. https://github.com/microsoft/LLMLingua
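
To make the idea of dropping non-essential tokens concrete, here is a toy sketch. It is not LLMLingua's algorithm or API: it assumes a unigram word-frequency table as a stand-in for the small language model, and the names `compress_prompt`, `reference_corpus`, and `keep_ratio` are hypothetical. Words that are common in the reference text score as low-information and are removed first.

```python
import math
from collections import Counter


def compress_prompt(prompt: str, reference_corpus: str, keep_ratio: float = 0.5) -> str:
    """Toy prompt compression: keep the most 'surprising' words, in original order."""
    # Assumption: unigram counts stand in for a small LM's token probabilities.
    counts = Counter(reference_corpus.lower().split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words (add-one smoothing)

    tokens = prompt.split()
    # Self-information -log p(word): rare words score higher than common ones.
    scores = [
        -math.log((counts[t.lower()] + 1) / (total + vocab)) for t in tokens
    ]

    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Pick the n_keep highest-scoring positions (ties go to the earlier word),
    # then restore the original word order.
    ranked = sorted(range(len(tokens)), key=lambda i: (-scores[i], i))
    keep = sorted(ranked[:n_keep])
    return " ".join(tokens[i] for i in keep)


corpus = "the the the the of of of a a a is is cat"
print(compress_prompt("the capital of France is Paris", corpus, keep_ratio=0.5))
# prints "capital France Paris"
```

A real compressor would score tokens with an actual language model and use a real tokenizer rather than whitespace splitting; the sketch only shows the keep-the-informative-tokens step.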

Automated Unit Test Improvement using Large Language Models at Meta

https://arxiv.org/abs/2402.09171