|
1 | | -<div align="center"> |
2 | | - <img src="figures/logo.png" alt="logo" width="50%"> |
3 | | -</div> |
| 2 | +# 🌟 Batch LLM Inference with Ray Data LLM 🌟 |
| 3 | +## From Simple to Advanced |
4 | 4 |
|
5 | | -# Batch LLM Inference with Ray Data LLM |
| 5 | +Welcome to the **Batch LLM Inference with Ray Data LLM** repository! This project simplifies batch inference with large language models (LLMs) across a range of applications. By building on Ray Data and Ray Serve, it enables efficient parallel processing and distributed execution. |
6 | 6 |
|
7 | | -## Table of Contents |
| 7 | + |
8 | 8 |
|
9 | | -<details> |
10 | | - <summary><a href="#1-introduction"><i><b>1. Introduction</b></i></a></summary> |
11 | | - <div> |
12 | | - <a href="#11-ray">1.1. Ray</a><br> |
13 | | - <a href="#12-ray-data-llm">1.2. Ray Data LLM</a><br> |
14 | | - <a href="#13-problem-generate-responses-to-common-questions">1.3. Problem: Generate Responses to Common Questions</a><br> |
15 | | - <a href="#14-what-you-want">1.4. What You Want</a><br> |
16 | | - <a href="#15-how-ray-data-llm-helps">1.5. How Ray Data LLM Helps</a><br> |
17 | | - <a href="#16-why-use-ray-data-llm">1.6. Why Use Ray Data LLM</a><br> |
18 | | - <a href="#17-next-steps">1.7. Next Steps</a><br> |
19 | | - </div> |
20 | | -</details> |
| 9 | +--- |
21 | 10 |
|
22 | | -<details> |
23 | | - <summary><a href="#2-fundamental-level-generating-responses-to-common-questions"><i><b>2. Fundamental Level: Generating Responses to Common Questions</b></i></a></summary> |
24 | | - <div> |
25 | | - <a href="#21-problem">2.1. Problem</a><br> |
26 | | - <a href="#22-example-questions">2.2. Example Questions</a><br> |
27 | | - <a href="#23-why-batch-processing">2.3. Why Batch Processing</a><br> |
28 | | - <a href="#24-real-life-example">2.4. Real-Life Example</a><br> |
29 | | - </div> |
30 | | -</details> |
| 11 | +## 📚 Table of Contents |
31 | 12 |
|
32 | | -<details> |
33 | | - <summary><a href="#3-intermediate-level-text-generation-from-user-prompts"><i><b>3. Intermediate Level: Text Generation from User Prompts</b></i></a></summary> |
34 | | - <div> |
35 | | - <a href="#31-problem">3.1. Problem</a><br> |
36 | | - <a href="#32-example-prompts">3.2. Example Prompts</a><br> |
37 | | - <a href="#33-why-use-ray-data-llm">3.3. Why Use Ray Data LLM</a><br> |
38 | | - </div> |
39 | | -</details> |
| 13 | +- [Features](#features) |
| 14 | +- [Topics](#topics) |
| 15 | +- [Getting Started](#getting-started) |
| 16 | +- [Installation](#installation) |
| 17 | +- [Usage](#usage) |
| 18 | +- [Examples](#examples) |
| 19 | +- [Contributing](#contributing) |
| 20 | +- [License](#license) |
| 21 | +- [Contact](#contact) |
| | +- [Releases](#releases) |
| | +- [Conclusion](#conclusion) |
40 | 22 |
|
41 | | -<details> |
42 | | - <summary><a href="#4-advanced-level-real-world-use-case"><i><b>4. Advanced Level: Real-World Use Case</b></i></a></summary> |
43 | | - <div> |
44 | | - <a href="#41-problem">4.1. Problem</a><br> |
45 | | - <a href="#42-why-batch-processing">4.2. Why Batch Processing</a><br> |
46 | | - <a href="#43-how-ray-data-llm-helps">4.3. How Ray Data LLM Helps</a><br> |
47 | | - </div> |
48 | | -</details> |
| 23 | +--- |
49 | 24 |
|
50 | | -## 1. Introduction |
| 25 | +## ⚡ Features |
51 | 26 |
|
52 | | -Think about a customer support bot that receives 1000 similar questions every day. Instead of answering each one as it comes, you could collect the questions and process them in bulk (Batch processing), giving quick responses to all of them at once. |
| 27 | +- **Batch Inference**: Process multiple inputs in a single call for efficiency. |
| 28 | +- **Distributed Computing**: Scale your applications seamlessly across multiple nodes. |
| 29 | +- **Support for Large Language Models**: Utilize state-of-the-art models for natural language processing tasks. |
| 30 | +- **Parallel Processing**: Speed up tasks by processing data in parallel. |
| 31 | +- **Integration with Ray Data and Ray Serve**: Take advantage of robust tools for distributed applications. |
53 | 32 |
|
54 | | -Batch processing is essential when generating text responses efficiently using large language models (LLMs). Instead of processing each input one by one, batch inference allows the handling of multiple inputs at once, making it faster and more scalable. |
| 33 | +## 🏷️ Topics |
55 | 34 |
|
56 | | -In this tutorial, we will start with a straightforward problem to introduce the concept of batch inference with Ray Data LLM. We will then progress to more advanced scenarios, gradually building your understanding and skills. |
| 35 | +This repository covers a variety of topics related to large language models and distributed computing, including: |
57 | 36 |
|
58 | | -### 1.1. Ray |
| 37 | +- batch-inference |
| 38 | +- distributed-computing |
| 39 | +- large-language-models |
| 40 | +- llm |
| 41 | +- llm-api |
| 42 | +- nlp |
| 43 | +- parallel-processing |
| 44 | +- ray |
| 45 | +- ray-data |
| 46 | +- ray-serve |
| 47 | +- vllm |
59 | 48 |
|
60 | | -Ray is an open-source framework for building and running distributed applications. It enables developers to scale Python applications from a single machine to a cluster, making parallel computing more accessible. Ray provides a unified API for various tasks such as data processing, model training, and inference. |
| 49 | +--- |
61 | 50 |
|
62 | | -Key Features: |
| 51 | +## 🚀 Getting Started |
63 | 52 |
|
64 | | -* Distributed computing made easy with Pythonic syntax |
65 | | -* Built-in support for machine learning workflows |
66 | | -* Libraries for hyperparameter tuning (Ray Tune), model serving (Ray Serve), and data processing (Ray Data) |
67 | | -* Scales seamlessly from a single machine to a large cluster |
| 53 | +To begin, set up your environment by following the installation instructions below. |
68 | 54 |
|
69 | | -### 1.2. Ray Data LLM |
| 55 | +### Prerequisites |
70 | 56 |
|
71 | | -Ray Data LLM is a module within Ray designed for batch inference using large language models (LLMs). It allows developers to perform batch text generation efficiently by integrating LLM processing into existing Ray Data pipelines. |
| 57 | +- Python 3.7 or later |
| 58 | +- pip |
| 59 | +- Access to a GPU (optional, but recommended for performance) |
72 | 60 |
|
73 | | -Key Features: |
| 61 | +--- |
74 | 62 |
|
75 | | -* Batch Processing: |
76 | | - * Simultaneous processing of multiple inputs for faster throughput |
77 | | -* Efficient Scaling: |
78 | | - * Distributes tasks across multiple CPUs/GPUs |
79 | | -* Seamless Integration: |
80 | | - * Works with popular LLMs like Meta Llama |
81 | | -* OpenAI Compatibility: |
82 | | - * Easily integrates with OpenAI API endpoints |
| 63 | +## 🛠️ Installation |
83 | 64 |
|
84 | | -By using Ray Data LLM, developers can scale their LLM-based applications without the hassle of manual optimization or distributed computing setups. |
| 65 | +Clone the repository to your local machine: |
85 | 66 |
|
86 | | -### 1.3. Problem: Generate Responses to Common Questions |
| 67 | +```bash |
| 68 | +git clone https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM.git |
| 69 | +cd Batch_LLM_Inference_with_Ray_Data_LLM |
| 70 | +``` |
87 | 71 |
|
88 | | -Imagine you have a list of common questions people might ask a chatbot. Instead of responding to each question one by one, you want to generate answers for all the questions at once. |
| 72 | +Install the required packages: |
89 | 73 |
|
90 | | -#### 1.3.1. Example Questions: |
| 74 | +```bash |
| 75 | +pip install -r requirements.txt |
| 76 | +``` |
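| | +
| | +After installing, you can optionally verify that Ray is importable and see which resources it detects. This is a quick sanity check, not part of the project's documented setup: |
| | +
| | +```python |
| | +import ray |
| | +
| | +# Start a local Ray runtime and print the resources it can see (CPUs, GPUs, memory). |
| | +ray.init() |
| | +print(ray.cluster_resources()) |
| | +ray.shutdown() |
| | +``` |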
91 | 77 |
|
92 | | -1. "What is the weather like today?" |
93 | | -2. "Tell me a joke." |
94 | | -3. "Give me a motivational quote." |
95 | | -4. "What is 2+2?" |
| 78 | +--- |
96 | 79 |
|
97 | | -### 1.4. What You Want: |
| 80 | +## 📖 Usage |
98 | 81 |
|
99 | | -Instead of sending each question to the chatbot separately, you want to batch process them all at once, saving time and computing resources. |
| 82 | +To run batch inference, first deploy the model as a service, then send it batches of inputs. |
100 | 83 |
|
101 | | -### 1.5. How Ray Data LLM Helps: |
| 84 | +### Example Code |
102 | 85 |
|
103 | | -* **Batch Processing:** Instead of generating responses one by one, Ray Data LLM lets you process all questions simultaneously. |
104 | | -* **Efficient Scaling:** If you have 1000 questions, the system can distribute the workload across multiple processors, making it much faster. |
| 86 | +Here’s a basic example to get you started: |
105 | 87 |
|
106 | | -### 1.6. Why Use Ray Data LLM? |
| 88 | +```python |
| 89 | +import ray |
| | +from ray import serve |
107 | 90 |
|
108 | | -Ray Data LLM integrates seamlessly with existing Ray Data pipelines, allowing for efficient batch inference with LLMs. It enables: |
| 91 | +# Start Ray (attaches to an existing cluster or starts one locally) |
| 92 | +ray.init() |
109 | 94 |
|
110 | | -* High-throughput processing |
111 | | -* Distributed execution |
112 | | -* Integration with OpenAI-compatible endpoints |
113 | | -* Support for popular LLMs like Meta Llama |
| 95 | +# Define a Serve deployment; replace the body with your model's inference logic |
| 96 | +@serve.deployment |
| 97 | +def llm_inference(request): |
| 98 | +    # Placeholder: run the LLM on the request payload and return its output |
| 99 | +    return {"text": f"echo: {request}"} |
114 | 100 |
|
115 | | -### 1.7. Next Steps |
| 101 | +# Deploy the service and keep a handle for sending requests |
| 102 | +handle = serve.run(llm_inference.bind()) |
| 103 | +``` |
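| | +
| | +Since the repository's focus is batch inference, here is a sketch of the Ray Data LLM path as well. It follows the `ray.data.llm` API introduced in Ray 2.44 (in some versions the config field is `model` rather than `model_source`), and the model name is only an example: |
| | +
| | +```python |
| | +import ray |
| | +from ray.data.llm import build_llm_processor, vLLMEngineProcessorConfig |
| | +
| | +# Configure a vLLM engine for batch text generation. |
| | +config = vLLMEngineProcessorConfig( |
| | +    model_source="meta-llama/Llama-3.1-8B-Instruct",  # example model |
| | +    concurrency=1,  # number of vLLM replicas |
| | +    batch_size=64,  # rows per inference batch |
| | +) |
| | +
| | +processor = build_llm_processor( |
| | +    config, |
| | +    # Turn each input row into a chat-style request. |
| | +    preprocess=lambda row: dict( |
| | +        messages=[{"role": "user", "content": row["prompt"]}], |
| | +        sampling_params=dict(temperature=0.3, max_tokens=128), |
| | +    ), |
| | +    # Keep just the generated text in each output row. |
| | +    postprocess=lambda row: dict(answer=row["generated_text"]), |
| | +) |
| | +
| | +ds = ray.data.from_items([{"prompt": "Tell me a joke."}, {"prompt": "What is 2+2?"}]) |
| | +processor(ds).show() |
| | +``` |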
116 | 104 |
|
117 | | -Ray Data LLM provides a simple and scalable way to perform batch inference using LLMs. Starting from basic chatbot responses to complex sentiment analysis, it enables high-throughput text generation and processing. |
| 105 | +--- |
118 | 106 |
|
119 | | -* Set up Ray and Ray Data on your system. |
120 | | -* Follow the provided code examples to implement the discussed problems. |
121 | | -* Experiment with different models and configurations for better performance. |
| 107 | +## 📊 Examples |
122 | 108 |
|
123 | | -## 2. Fundamental Level: Generating Responses to Common Questions |
| 109 | +Explore practical examples to see how batch inference works with different LLMs. Each example demonstrates a specific feature or use case. |
124 | 110 |
|
125 | | -### 2.1. Problem |
| 111 | +### Example 1: Sentiment Analysis |
126 | 112 |
|
127 | | -Imagine you have a list of common questions that people might ask a chatbot. Instead of generating a response for each question one by one, you want to generate answers for all the questions at once. |
| 113 | +Use the deployed service to analyze the sentiment of a batch of texts, sending each text through the `handle` returned by `serve.run` in the Usage example: |
128 | 114 |
|
129 | | -#### 2.2. Example Questions: |
| 115 | +```python |
| 116 | +texts = ["I love this!", "This is terrible."] |
| 117 | +responses = [handle.remote(text) for text in texts]  # dispatched concurrently |
| 118 | +print([response.result() for response in responses]) |
| 119 | +``` |
130 | 120 |
|
131 | | -1. "What is the weather like today?" |
132 | | -2. "Tell me a joke." |
133 | | -3. "Give me a motivational quote." |
134 | | -4. "What is 2+2?" |
| 121 | +### Example 2: Text Generation |
135 | 122 |
|
136 | | -### 2.3. Why Batch Processing? |
| 123 | +Generate text from a batch of prompts, again via the service handle: |
137 | 124 |
|
138 | | -Instead of sending each question to the chatbot separately, batch processing allows you to process all questions simultaneously, saving time and computing resources. |
| 125 | +```python |
| 126 | +prompts = ["Once upon a time", "In a galaxy far away"] |
| 127 | +generated = [handle.remote(prompt) for prompt in prompts] |
| 128 | +print([response.result() for response in generated]) |
| 129 | +``` |
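| | +
| | +The loops above submit requests individually and let Ray run them concurrently. Ray Serve can also group concurrent requests into a single model call with its `@serve.batch` decorator; this sketch uses an echo response as a stand-in for real LLM inference: |
| | +
| | +```python |
| | +import ray |
| | +from ray import serve |
| | +
| | +@serve.deployment |
| | +class BatchedLLM: |
| | +    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.1) |
| | +    async def infer(self, prompts): |
| | +        # Serve delivers up to 8 concurrent requests as one list; |
| | +        # return exactly one output per input, in the same order. |
| | +        return [f"echo: {p}" for p in prompts] |
| | +
| | +    async def __call__(self, prompt: str): |
| | +        # Each caller passes a single prompt; batching happens transparently. |
| | +        return await self.infer(prompt) |
| | +
| | +handle = serve.run(BatchedLLM.bind()) |
| | +``` |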
139 | 130 |
|
140 | | -#### 2.4. Real-Life Example |
| 131 | +--- |
141 | 132 |
|
142 | | -Think about a customer support bot that receives thousands of similar questions daily. Instead of answering each one as it arrives, you could collect them and process them all at once, providing quick responses efficiently. |
| 133 | +## 🤝 Contributing |
143 | 134 |
|
144 | | -## 3. Intermediate Level: Text Generation from User Prompts |
| 135 | +Contributions are welcome! Please follow these steps to contribute: |
145 | 136 |
|
146 | | -### 3.1. Problem |
| 137 | +1. Fork the repository. |
| 138 | +2. Create a new branch (`git checkout -b feature-branch`). |
| 139 | +3. Make your changes and commit them (`git commit -m 'Add new feature'`). |
| 140 | +4. Push to your branch (`git push origin feature-branch`). |
| 141 | +5. Create a pull request. |
147 | 142 |
|
148 | | -Now, imagine you have a list of user prompts for creative text generation. Instead of generating each text separately, you want to create a pipeline that processes all prompts together. |
| 143 | +--- |
149 | 144 |
|
150 | | -#### 3.2. Example Prompts: |
| 145 | +## 📄 License |
151 | 146 |
|
152 | | -* "Write a haiku about nature." |
153 | | -* "Generate a short story about space exploration." |
154 | | -* "Summarize the following paragraph..." |
| 147 | +This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details. |
155 | 148 |
|
156 | | -### 3.3. Why Use Ray Data LLM? |
| 149 | +--- |
157 | 150 |
|
158 | | -* You can use Ray Data to load a large number of prompts at once. |
159 | | -* Ray Data LLM allows you to batch process these prompts with the selected LLM model. |
| 151 | +## 📬 Contact |
160 | 152 |
|
161 | | -## 4. Advanced Level: Real-World Use Case |
| 153 | +For questions or support, feel free to reach out: |
162 | 154 |
|
163 | | -### 4.1. Problem |
| 155 | +- **GitHub**: [ARYAN555279](https://github.com/ARYAN555279) |
| 156 | +- **Email**: your_email@example.com |
164 | 157 |
|
165 | | -You have a dataset containing thousands of social media posts. You want to classify the sentiment (positive, negative, neutral) of each post using an LLM. |
| 158 | +--- |
166 | 159 |
|
167 | | -### 4.2. Why Batch Processing? |
| 160 | +## 📦 Releases |
168 | 161 |
|
169 | | -Performing sentiment analysis on each post individually would take too much time. By batching, you can classify the entire dataset at once. |
| 162 | +You can find the latest releases [here](https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM/releases). Download the packaged files for the version you want to use. |
170 | 163 |
|
171 | | -#### 4.3. How Ray Data LLM Helps: |
| 164 | +[Download the latest release](https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM/releases) |
172 | 165 |
|
173 | | -* Efficiently loads the dataset. |
174 | | -* Applies the LLM processor to generate sentiment labels. |
175 | | -* Uses distributed processing to handle large volumes of data efficiently. |
| 166 | +--- |
| 167 | + |
| 168 | +## 🎉 Conclusion |
| 169 | + |
| 170 | +Thank you for checking out the **Batch LLM Inference with Ray Data LLM** repository. We hope this project aids in your journey with large language models. Happy coding! |
| 171 | + |
| 172 | + |