Commit 1dff07b

1 parent 1fbc34f commit 1dff07b

File tree

1 file changed (+117 -119 lines)

README.md

Lines changed: 117 additions & 119 deletions
@@ -1,175 +1,173 @@
-<div align="center">
-<img src="figures/logo.png" alt="logo" width="50%">
-</div>
+# 🌟 Batch LLM Inference with Ray Data LLM 🌟
+## From Simple to Advanced

-# Batch LLM Inference with Ray Data LLM
+Welcome to the **Batch LLM Inference with Ray Data LLM** repository! This project aims to simplify and enhance the use of large language models (LLMs) for batch inference in various applications. By leveraging the capabilities of Ray Data and Ray Serve, this repository enables efficient parallel processing and distributed computing.

-## Table of Contents
+![Ray Logo](https://raw.githubusercontent.com/ray-project/ray/master/python/ray/assets/ray_logo.svg)

-<details>
-<summary><a href="#1-introduction"><i><b>1. Introduction</b></i></a></summary>
-<div>
-<a href="#11-ray">1.1. Ray</a><br>
-<a href="#12-ray-data-llm">1.2. Ray Data LLM</a><br>
-<a href="#13-problem-generate-responses-to-common-questions">1.3. Problem: Generate Responses to Common Questions</a><br>
-<a href="#14-what-you-want">1.4. What You Want</a><br>
-<a href="#15-how-ray-data-llm-helps">1.5. How Ray Data LLM Helps</a><br>
-<a href="#16-why-use-ray-data-llm">1.6. Why Use Ray Data LLM</a><br>
-<a href="#17-next-steps">1.7. Next Steps</a><br>
-</div>
-</details>
+---

-<details>
-<summary><a href="#2-fundamental-level-generating-responses-to-common-questions"><i><b>2. Fundamental Level: Generating Responses to Common Questions</b></i></a></summary>
-<div>
-<a href="#21-problem">2.1. Problem</a><br>
-<a href="#22-example-questions">2.2. Example Questions</a><br>
-<a href="#23-why-batch-processing">2.3. Why Batch Processing</a><br>
-<a href="#24-real-life-example">2.4. Real-Life Example</a><br>
-</div>
-</details>
+## 📚 Table of Contents

-<details>
-<summary><a href="#3-intermediate-level-text-generation-from-user-prompts"><i><b>3. Intermediate Level: Text Generation from User Prompts</b></i></a></summary>
-<div>
-<a href="#31-problem">3.1. Problem</a><br>
-<a href="#32-example-prompts">3.2. Example Prompts</a><br>
-<a href="#33-why-use-ray-data-llm">3.3. Why Use Ray Data LLM</a><br>
-</div>
-</details>
+- [Features](#features)
+- [Topics](#topics)
+- [Getting Started](#getting-started)
+- [Installation](#installation)
+- [Usage](#usage)
+- [Examples](#examples)
+- [Contributing](#contributing)
+- [License](#license)
+- [Contact](#contact)

-<details>
-<summary><a href="#4-advanced-level-real-world-use-case"><i><b>4. Advanced Level: Real-World Use Case</b></i></a></summary>
-<div>
-<a href="#41-problem">4.1. Problem</a><br>
-<a href="#42-why-batch-processing">4.2. Why Batch Processing</a><br>
-<a href="#43-how-ray-data-llm-helps">4.3. How Ray Data LLM Helps</a><br>
-</div>
-</details>
+---

-## 1. Introduction
+## ⚡ Features

-Think about a customer support bot that receives 1000 similar questions every day. Instead of answering each one as it comes, you could collect the questions and process them in bulk (batch processing), giving quick responses to all of them at once.
+- **Batch Inference**: Process multiple inputs in a single call for efficiency (see the sketch after this list).
+- **Distributed Computing**: Scale your applications seamlessly across multiple nodes.
+- **Support for Large Language Models**: Utilize state-of-the-art models for natural language processing tasks.
+- **Parallel Processing**: Speed up tasks by processing data in parallel.
+- **Integration with Ray Data and Ray Serve**: Take advantage of robust tools for distributed applications.
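+
+As a concrete sketch of how these pieces fit together, here is a minimal batch pipeline written against Ray Data's LLM processor API. This is an illustration rather than this repository's code: it assumes Ray >= 2.44 with the `ray.data.llm` module and vLLM installed, the model id is a placeholder, and parameter names may vary slightly across Ray versions.
+
+```python
+import ray
+from ray.data.llm import vLLMEngineProcessorConfig, build_llm_processor
+
+# Illustrative config; "model_source" follows recent Ray docs (older releases
+# name this parameter "model"), and the model id is a placeholder.
+config = vLLMEngineProcessorConfig(
+    model_source="meta-llama/Llama-3.1-8B-Instruct",
+    concurrency=1,  # number of vLLM engine replicas
+    batch_size=64,  # rows sent to each engine per batch
+)
+
+# preprocess turns each row into chat messages; postprocess keeps the output.
+processor = build_llm_processor(
+    config,
+    preprocess=lambda row: dict(
+        messages=[{"role": "user", "content": row["prompt"]}],
+        sampling_params=dict(temperature=0.3, max_tokens=128),
+    ),
+    postprocess=lambda row: dict(answer=row["generated_text"]),
+)
+
+# One pipeline call runs batch inference over the whole dataset.
+ds = ray.data.from_items([{"prompt": "What is 2+2?"}, {"prompt": "Tell me a joke."}])
+ds = processor(ds)
+ds.show()
+```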

-Batch processing is essential when generating text responses efficiently using large language models (LLMs). Instead of processing each input one by one, batch inference allows the handling of multiple inputs at once, making it faster and more scalable.
+## 🏷️ Topics

-In this tutorial, we will start with a straightforward problem to introduce the concept of batch inference with Ray Data LLM. We will then progress to more advanced scenarios, gradually building your understanding and skills.
+This repository covers a variety of topics related to large language models and distributed computing, including:

-### 1.1. Ray
+- batch-inference
+- distributed-computing
+- large-language-models
+- llm
+- llm-api
+- nlp
+- parallel-processing
+- ray
+- ray-data
+- ray-serve
+- vllm

-Ray is an open-source framework for building and running distributed applications. It enables developers to scale Python applications from a single machine to a cluster, making parallel computing more accessible. Ray provides a unified API for various tasks such as data processing, model training, and inference.
+---

-Key Features:
+## 🚀 Getting Started

-* Distributed computing made easy with Pythonic syntax
-* Built-in support for machine learning workflows
-* Libraries for hyperparameter tuning (Ray Tune), model serving (Ray Serve), and data processing (Ray Data)
-* Scales seamlessly from a single machine to a large cluster
+To begin, set up your environment by following the installation instructions below.

-### 1.2. Ray Data LLM
+### Prerequisites

-Ray Data LLM is a module within Ray designed for batch inference using large language models (LLMs). It allows developers to perform batch text generation efficiently by integrating LLM processing into existing Ray Data pipelines.
+- Python 3.7 or later
+- pip
+- Access to a GPU (optional, but recommended for performance; see the probe after this list)
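+
+Once Ray is installed (next section), a minimal probe shows whether it detects a GPU. This is an optional sanity check, not repository code:
+
+```python
+import ray
+
+ray.init()  # start a local Ray instance
+# cluster_resources() reports cluster totals; "GPU" is absent if none is detected.
+print("GPUs visible to Ray:", ray.cluster_resources().get("GPU", 0))
+```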

-Key Features:
+---

-* Batch Processing:
-  * Simultaneous processing of multiple inputs for faster throughput
-* Efficient Scaling:
-  * Distributes tasks across multiple CPUs/GPUs
-* Seamless Integration:
-  * Works with popular LLMs like Meta Llama
-* OpenAI Compatibility:
-  * Easily integrates with OpenAI API endpoints
+## 🛠️ Installation

-By using Ray Data LLM, developers can scale their LLM-based applications without the hassle of manual optimization or distributed computing setups.
+Clone the repository to your local machine:

-### 1.3. Problem: Generate Responses to Common Questions
+```bash
+git clone https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM.git
+cd Batch_LLM_Inference_with_Ray_Data_LLM
+```

-Imagine you have a list of common questions people might ask a chatbot. Instead of responding to each question one by one, you want to generate answers for all the questions at once.
+Install the required packages:

-#### 1.3.1. Example Questions:
+```bash
+pip install -r requirements.txt
+```
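+
+Assuming the pinned requirements include Ray (check `requirements.txt`), a one-line import confirms the environment is usable:
+
+```python
+import ray
+
+print(ray.__version__)  # prints the installed Ray version on success
+```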

-1. "What is the weather like today?"
-2. "Tell me a joke."
-3. "Give me a motivational quote."
-4. "What is 2+2?"
+---

-### 1.4. What You Want:
+## 📖 Usage

-Instead of sending each question to the chatbot separately, you want to batch process them all at once, saving time and computing resources.
+To use the batch inference feature, first load your model and prepare your data, as in the example below.


-### 1.5. How Ray Data LLM Helps:
+### Example Code

-* **Batch Processing:** Instead of generating responses one by one, Ray Data LLM lets you process all questions simultaneously.
-* **Efficient Scaling:** If you have 1000 questions, the system can distribute the workload across multiple processors, making it much faster.
+Here is a basic example to get you started:

-### 1.6. Why Use Ray Data LLM?
+```python
+import ray
+from ray import serve

-Ray Data LLM integrates seamlessly with existing Ray Data pipelines, allowing for efficient batch inference with LLMs. It enables:
+# Start Ray (or connect to an existing cluster).
+ray.init()

-* High-throughput processing
-* Distributed execution
-* Integration with OpenAI-compatible endpoints
-* Support for popular LLMs like Meta Llama
+# Wrap your LLM in a Ray Serve deployment.
+@serve.deployment
+def llm_inference(request):
+    # Replace this stub with real LLM inference logic.
+    return f"echo: {request}"

-### 1.7. Next Steps
+# Deploy the service and keep a handle for sending requests.
+handle = serve.run(llm_inference.bind())
+```
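+
+Once deployed, requests go through the handle returned by `serve.run`. A short sketch, assuming a recent Ray Serve where handle calls return response objects with a blocking `result()`:
+
+```python
+response = handle.remote("Hello, LLM!")  # schedule a request on the deployment
+print(response.result())                 # block until the reply arrives
+```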

-Ray Data LLM provides a simple and scalable way to perform batch inference using LLMs. Starting from basic chatbot responses to complex sentiment analysis, it enables high-throughput text generation and processing.
+---

-* Set up Ray and Ray Data on your system.
-* Follow the provided code examples to implement the discussed problems.
-* Experiment with different models and configurations for better performance.
+## 📊 Examples

-## 2. Fundamental Level: Generating Responses to Common Questions
+Explore the practical examples below to see how batch inference works with different LLMs. Each example demonstrates a specific feature or use case.

-### 2.1. Problem
+### Example 1: Sentiment Analysis

-Imagine you have a list of common questions that people might ask a chatbot. Instead of generating a response for each question one by one, you want to generate answers for all the questions at once.
+Use the deployed LLM to analyze the sentiment of a batch of texts:

-#### 2.2. Example Questions:
+```python
+texts = ["I love this!", "This is terrible."]
+# Send the whole batch through the Serve handle from the Usage section.
+results = [handle.remote(t).result() for t in texts]
+print(results)
+```

-1. "What is the weather like today?"
-2. "Tell me a joke."
-3. "Give me a motivational quote."
-4. "What is 2+2?"
+### Example 2: Text Generation

-### 2.3. Why Batch Processing?
+Generate text from a batch of prompts:

-Instead of sending each question to the chatbot separately, batch processing allows you to process all questions simultaneously, saving time and computing resources.
+```python
+prompts = ["Once upon a time", "In a galaxy far away"]
+# Reuse the same handle; each prompt becomes one request.
+generated_texts = [handle.remote(p).result() for p in prompts]
+print(generated_texts)
+```

-#### 2.4. Real-Life Example
+---

-Think about a customer support bot that receives thousands of similar questions daily. Instead of answering each one as it arrives, you could collect them and process them all at once, providing quick responses efficiently.
+## 🤝 Contributing

-## 3. Intermediate Level: Text Generation from User Prompts
+Contributions are welcome! Please follow these steps to contribute:

-### 3.1. Problem
+1. Fork the repository.
+2. Create a new branch (`git checkout -b feature-branch`).
+3. Make your changes and commit them (`git commit -m 'Add new feature'`).
+4. Push to your branch (`git push origin feature-branch`).
+5. Create a pull request.

-Now, imagine you have a list of user prompts for creative text generation. Instead of generating each text separately, you want to create a pipeline that processes all prompts together.
+---

-#### 3.2. Example Prompts:
+## 📄 License

-* "Write a haiku about nature."
-* "Generate a short story about space exploration."
-* "Summarize the following paragraph..."
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

-### 3.3. Why Use Ray Data LLM?
+---

-* You can use Ray Data to load a large number of prompts at once.
-* Ray Data LLM allows you to batch process these prompts with the selected LLM model.
+## 📬 Contact

-## 4. Advanced Level: Real-World Use Case
+For questions or support, feel free to reach out:

-### 4.1. Problem
+- **GitHub**: [ARYAN555279](https://github.com/ARYAN555279)
+- **Email**: your_email@example.com

-You have a dataset containing thousands of social media posts. You want to classify the sentiment (positive, negative, neutral) of each post using an LLM.
+---

-### 4.2. Why Batch Processing?
+## 📦 Releases

-Performing sentiment analysis on each post individually would take too much time. By batching, you can classify the entire dataset at once.
+You can find the latest releases [here](https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM/releases). Download the files you need to get started.

-#### 4.3. How Ray Data LLM Helps:
+[![View Releases](https://img.shields.io/badge/View%20Releases-%20blue)](https://github.com/ARYAN555279/Batch_LLM_Inference_with_Ray_Data_LLM/releases)

-* Efficiently loads the dataset.
-* Applies the LLM processor to generate sentiment labels.
-* Uses distributed processing to handle large volumes of data efficiently.
+---
+
+## 🎉 Conclusion
+
+Thank you for checking out the **Batch LLM Inference with Ray Data LLM** repository. We hope this project aids your journey with large language models. Happy coding!
+
+![Thank You](https://raw.githubusercontent.com/yourusername/yourrepo/main/thank_you.png)
