PromptRL is a reinforcement learning project for learning cost-aware LLM configurations. It treats prompt strategy selection as a Q-learning problem and searches for the best combination of model, reasoning mode, and persona at each task difficulty level. Instead of sending every task through the same model and prompt style, PromptRL explores a discrete action space and optimizes for output quality relative to inference cost.
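The idea can be illustrated with a minimal tabular sketch. This is not the project's actual code: the option lists, the `reward` shape (quality minus a weighted cost penalty), the hyperparameters, and the `evaluate` callback are all assumptions made for illustration. The state is the task difficulty level and an action is one (model, reasoning mode, persona) combination.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical option lists; the real project may use different values.
MODELS = ["small-model", "large-model"]
REASONING_MODES = ["direct", "step-by-step"]
PERSONAS = ["neutral", "expert"]

# Discrete action space: every (model, reasoning mode, persona) combination.
ACTIONS = list(itertools.product(MODELS, REASONING_MODES, PERSONAS))

# States are the task difficulty levels.
STATES = ["easy", "medium", "hard"]


def reward(quality, cost, cost_weight=0.5):
    """Assumed reward shape: quality minus a weighted cost penalty."""
    return quality - cost_weight * cost


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the discrete action space."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def update(q, state, action, r, alpha=0.1):
    """Q-learning update for a single-step episode.

    Each task is answered once, so there is no next state and the
    update target reduces to the immediate reward.
    """
    q[(state, action)] += alpha * (r - q[(state, action)])


def train(evaluate, episodes=2000, epsilon=0.2, seed=0):
    """Learn a difficulty -> configuration policy.

    `evaluate(state, action)` is a hypothetical callback that runs the
    task and returns a (quality, cost) pair.
    """
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    for _ in range(episodes):
        state = rng.choice(STATES)
        action = choose_action(q, state, epsilon, rng)
        quality, cost = evaluate(state, action)
        update(q, state, action, reward(quality, cost))
    # Greedy policy: the highest-valued configuration per difficulty level.
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

A ratio such as quality divided by cost would be another reasonable reading of "quality relative to cost"; the linear penalty is used here only because it stays well-behaved when cost is close to zero.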
Tasks Included
easy: short tweet generation
medium: constrained LinkedIn post generation
hard: beginner-friendly explanation of quantum computing