Commit 7fb94fa

committed
Adding part 2
1 parent 40bbd50 commit 7fb94fa

File tree

1 file changed

+161
-11
lines changed

readme.md

Lines changed: 161 additions & 11 deletions
@@ -191,6 +191,7 @@ Your task is to implement `grow()`, which:
191191
Once implemented, your `grow()` function will be the engine that drives bottom-up search over the DSL, which step by step builds increasingly complex shapes.
192192

193193
``` python
194+
# TODO
194195
def grow(
195196
    self,
196197
    program_list: List[Shape],
@@ -210,13 +211,15 @@ Now that you can **grow** shapes, the next challenge is to keep your search spac
210211
For this, we’ll turn to the more general `BottomUpSynthesizer` (in `enumerative_synthesis.py`) and implement a pruning step: **eliminating observationally equivalent programs**.
211212

212213
Two programs are **observationally equivalent** if they produce the **same outputs** on the **same inputs**. For example, look at these two programs:
213-
* `Union(Circle(center=(0,0), r=1), Circle(center=(0,0), r=1))`
214-
* `Circle(center=(0,0), r=1)`
215-
These two programs are *different syntactically* but *indistinguishable observationally* (their outputs match on all test points).
214+
* `Union(Circle(0,0,1), Circle(0,0,1))`
215+
* `Circle(0,0,1)`
216+
217+
These two programs are *different syntactically* but *indistinguishable observationally* (their outputs match on all given test points).
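To make the idea concrete, here is a minimal sketch of deduplication by output signature. It does not use the assignment's `Shape` classes or the real `eliminate_equivalents` signature: `circle`, `union`, `signature`, and `dedupe` are hypothetical stand-ins that model a shape as a plain point-membership function.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]
Program = Callable[[Point], bool]  # hypothetical: a shape is a membership test


def circle(cx: float, cy: float, r: float) -> Program:
    """Stand-in for a circle: True if the point lies inside or on the circle."""
    return lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r ** 2


def union(a: Program, b: Program) -> Program:
    """Stand-in for a union: True if the point lies in either shape."""
    return lambda p: a(p) or b(p)


def signature(program: Program, test_inputs: List[Point]) -> Tuple[bool, ...]:
    """The observable behavior of a program: its outputs on every test input."""
    return tuple(program(p) for p in test_inputs)


def dedupe(programs: List[Program], test_inputs: List[Point]) -> List[Program]:
    """Keep only the first program seen for each distinct signature."""
    seen: Dict[Tuple[bool, ...], Program] = {}
    for prog in programs:
        sig = signature(prog, test_inputs)
        if sig not in seen:
            seen[sig] = prog
    return list(seen.values())


test_inputs = [(0.0, 0.0), (0.5, 0.5), (2.0, 2.0)]
c = circle(0, 0, 1)
candidates = [c, union(c, c), circle(2, 2, 1)]
unique = dedupe(candidates, test_inputs)
print(len(unique))  # the union of a circle with itself is pruned
```

Note that equivalence here is only relative to the chosen test points: two programs that differ somewhere else would still be merged.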
216218

217219
Your job is to filter out duplicates like these so the synthesizer only keeps *unique behaviors*. Please implement the `eliminate_equivalents` function:
218220

219221
```python
222+
# TODO
220223
def eliminate_equivalents(
221224
    self,
222225
    program_list: List[T],
@@ -247,7 +250,12 @@ Now it’s time to put everything together! You’ve implemented **growing** sha
247250
Head to the function `synthesize()` in `BottomUpSynthesizer` under the file `enumerative_synthesis.py`:
248251

249252
```python
250-
def synthesize(self, examples: List[Any], max_iterations: int = 5) -> T:
253+
# TODO
254+
def synthesize(
255+
    self,
256+
    examples: List[Any],
257+
    max_iterations: int = 5
258+
) -> T:
251259
```
252260

253261
The **bottom-up synthesis loop** works like this:
@@ -260,17 +268,17 @@ The **bottom-up synthesis loop** works like this:
260268

261269
* **Grow**: expand the program set one level deeper using `grow()`.
262270
* **Eliminate equivalents**: prune duplicates with `eliminate_equivalents()`.
263-
* **Check for success**: after pruning, see if any program satisfies all examples using `is_correct()`.
271+
* **Check for success**: while pruning (remember that we are using `yield`, so programs arrive one at a time), check whether any program satisfies all examples using `is_correct()`.
264272
* If yes → 🎉 return that program immediately.
265273
* Otherwise → continue expanding.
266274

267275
If no solution is found after `max_iterations`, you may `raise ValueError` (which is already provided).
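The loop described above can be sketched on a toy arithmetic DSL. This is an illustration only, under stated assumptions: it does not call the assignment's `generate_terminals`, `grow`, `eliminate_equivalents`, or `is_correct`, and the names `evaluate` and `toy_synthesize` as well as the tuple-based expression encoding are made up for this example.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical toy DSL: ("x",) | ("const", n) | ("add", e1, e2) | ("mul", e1, e2)
Expr = tuple


def evaluate(expr: Expr, x: int) -> int:
    """Interpret a toy expression on the input x."""
    op = expr[0]
    if op == "x":
        return x
    if op == "const":
        return expr[1]
    left, right = evaluate(expr[1], x), evaluate(expr[2], x)
    return left + right if op == "add" else left * right


def toy_synthesize(examples: List[Tuple[int, int]], max_iterations: int = 3) -> Expr:
    inputs = [x for x, _ in examples]
    target = tuple(y for _, y in examples)
    seen: Dict[Tuple[int, ...], Expr] = {}  # signature cache: outputs -> program
    frontier: List[Expr] = [("x",), ("const", 1), ("const", 2)]  # terminals
    for _ in range(max_iterations + 1):
        for prog in frontier:
            sig = tuple(evaluate(prog, x) for x in inputs)
            if sig in seen:
                continue  # observationally equivalent to a kept program
            seen[sig] = prog
            if sig == target:
                return prog  # satisfies every example
        # Grow: combine all surviving programs one level deeper.
        pool = list(seen.values())
        frontier = [(op, a, b) for op in ("add", "mul")
                    for a, b in product(pool, repeat=2)]
    raise ValueError("no program found within max_iterations")


examples = [(1, 3), (2, 5), (3, 7)]  # consistent with 2*x + 1
program = toy_synthesize(examples)
print(program)
```

Checking for success inside the pruning loop means the search stops at the first matching program instead of finishing the whole level first.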
268276

269277
> 💡 Hints & Tips
270278
>
271-
> * Use the helper functions! Don’t reinvent wheels`generate_terminals`, `grow`, `eliminate_equivalents`, and `is_correct` are there to help.
272-
> * Manage your **cache** carefully—without caching signatures, performance will tank as you repeatedly re-evaluate the same programs.
273-
> * When calling `eliminate_equivalents`, remember that the **test inputs** are just the `(x, y)` coordinates from the examples.
279+
> * Use the helper functions! Don’t reinvent the wheel: `generate_terminals`, `grow`, `eliminate_equivalents`, and `is_correct` are there to help.
280+
> * Manage your **cache** carefully. Without caching signatures, performance will tank as you repeatedly re-evaluate the same programs.
281+
> * When calling `eliminate_equivalents`, remember that the **test inputs** (`test_inputs`) are just the $(x, y)$ coordinates from the examples.
274282
275283
### 🎁 Wrapping Up Part 1
276284

@@ -287,8 +295,10 @@ python test_part1.py
287295
```
288296

289297
If all goes well, you’ll see your synthesizer solving the test cases one by one.
298+
The next goal is to make it faster.
290299

291-
For reference, our solution takes about **30 seconds** to pass all test cases on a MacBook Pro with an Apple M1 Pro chip and Python 3.13.0:
300+
Make sure your synthesizer can pass all test cases within 30 minutes (parallelized) on Gradescope.
301+
For reference, our solution takes about **30 seconds** to pass all test cases (sequentially) on a MacBook Pro with an Apple M1 Pro chip and Python 3.13.0:
292302

293303
```bash
294304
(.venv) ziyang@macbookpro assignment-1> python --version
@@ -305,8 +315,148 @@ Executed in 28.47 secs fish external
305315

306316
✨ Congratulations—you’ve just built your first **bottom-up program synthesizer**!
307317

308-
# Part 2: Bottom-up Synthesis for Strings
318+
# 🧵 Part 2: Bottom-up Synthesis for Strings
319+
320+
Now that you’ve conquered **shapes**, it’s time to switch gears and build a synthesizer for **string processing**.
321+
322+
A string processing program $f$ is simply a function:
323+
324+
$$
325+
f : \texttt{string} \to \texttt{string}
326+
$$
327+
328+
For example, Python’s built-in `.strip()` removes whitespace from both ends of a string:
329+
330+
```python
331+
" hello ".strip()  # ==> "hello"
332+
"ha ha ".strip()  # ==> "ha ha"
333+
" 100".strip()  # ==> "100"
334+
```
335+
336+
Just like with shapes, we can **chain operations** to build more powerful transformations.
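As a small illustration of chaining, the sketch below composes ordinary Python string methods into one string-to-string function. The helper `chain` is hypothetical and not part of the assignment code; it only shows the idea that a pipeline of string functions is itself a string function.

```python
from functools import reduce
from typing import Callable

StringFn = Callable[[str], str]


def chain(*steps: StringFn) -> StringFn:
    """Compose string functions left to right into a single function."""
    return lambda s: reduce(lambda acc, step: step(acc), steps, s)


# Strip whitespace first, then upper-case the result.
strip_then_upper = chain(str.strip, str.upper)
print(strip_then_upper("  hello "))  # HELLO
```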
337+
338+
### 🧩 The String DSL
339+
340+
Instead of starting with a fixed DSL, you will **design your own** string processing DSL. We provide you with two terminals and one operation:
341+
342+
```
343+
StringExpr ::= InputString # the program input string
344+
| StringLiteral(String) # a literal string, e.g. "hello"
345+
| Concatenate(StringExpr, StringExpr) # combine two expressions
346+
| ... # your new operations here!
347+
```
348+
349+
Your job is to extend this DSL with enough expressive power to solve the test cases.
350+
351+
### 🔍 Understanding the DSL
352+
353+
Each string operation is represented as a Python class that inherits from `StringExpression`. For example, here’s the provided `Concatenate`:
354+
355+
```python
356+
class Concatenate(StringExpression):
357+
    def __init__(self, left: StringExpression, right: StringExpression):
358+
        self.left = left
359+
        self.right = right
360+
361+
    def interpret(self, input_string: str) -> str:
362+
        return self.left.interpret(input_string) + self.right.interpret(input_string)
363+
```
364+
365+
Key points:
366+
367+
* `__init__`: defines the *syntax* of the operation (what arguments it takes).
368+
* `interpret`: defines the *semantics* (how to evaluate the operation).
369+
* Implementing `__eq__`, `__hash__`, and `__str__` is highly recommended for debugging and deduplication.
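Here is one possible shape for those helper methods, shown on a made-up operation. Everything in this sketch is an assumption for illustration: the `StringExpression` and `InputString` classes below are minimal stand-ins rather than the assignment's real classes, and `Upper` is a hypothetical operation that the provided DSL does not include.

```python
class StringExpression:
    """Minimal stand-in for the assignment's base class (assumed interface)."""

    def interpret(self, input_string: str) -> str:
        raise NotImplementedError


class InputString(StringExpression):
    """Stand-in terminal: evaluates to the program input."""

    def interpret(self, input_string: str) -> str:
        return input_string

    def __eq__(self, other: object) -> bool:
        return isinstance(other, InputString)

    def __hash__(self) -> int:
        return hash("InputString")

    def __str__(self) -> str:
        return "InputString"


class Upper(StringExpression):
    """Hypothetical operation: upper-cases the result of the inner expression."""

    def __init__(self, inner: StringExpression):
        self.inner = inner

    def interpret(self, input_string: str) -> str:
        return self.inner.interpret(input_string).upper()

    def __eq__(self, other: object) -> bool:
        # Structural equality: same operation and equal sub-expressions.
        return isinstance(other, Upper) and self.inner == other.inner

    def __hash__(self) -> int:
        # Must agree with __eq__ so equal programs collapse in sets and dicts.
        return hash(("Upper", self.inner))

    def __str__(self) -> str:
        return f"Upper({self.inner})"


a, b = Upper(InputString()), Upper(InputString())
print(a, a == b, len({a, b}))
```

Defining `__hash__` together with `__eq__` matters because Python sets a class's `__hash__` to `None` when only `__eq__` is overridden, which would make the programs unusable as dictionary keys or set members.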
370+
371+
> 👉 Pro tip: don’t handwrite too much boilerplate — LLMs are great at generating these helper methods.
372+
373+
### ✅ Testing Your Solution
374+
375+
Before you begin, run the test script:
376+
377+
```bash
378+
python test_part2.py
379+
```
380+
381+
You’ll see all test cases fail initially:
382+
383+
```
384+
test_get_parent_directory ✗ FAIL
385+
test_extract_directory_path ✗ FAIL
386+
test_normalize_path_separators ✗ FAIL
387+
388+
Summary:
389+
Total test cases: 25
390+
Synthesis succeeded: 0
391+
Fully correct: 0
392+
Success rate: 0.0%
393+
Accuracy rate: 0.0%
394+
```
395+
396+
Your goal: design your DSL and synthesizer so these tests pass!
397+
398+
### 🎯 Your Task
399+
400+
1. **Extend the DSL** with new operations (Part 2a).
401+
2. **Implement the grow function** so the synthesizer can explore programs using your DSL (Part 2b).
402+
403+
### Part 2(a): Creating Your Own DSL
404+
405+
Design new operations and add them as classes under `strings.py`.
406+
407+
> **Hints:**
408+
>
409+
> * Look at common string functions in Python/Java/C++ for inspiration (e.g., `substring`, `replace`, `find`).
410+
> * Check the test cases in `test_part2.py` and ask yourself: *What minimal set of operations can solve all of them?*
411+
> * Start small! Tackle easy test cases first, then add operators as needed.
412+
413+
### Part 2(b): Growing String Expressions
414+
415+
Now look at the `grow()` function in `StringSynthesizer` (`string_synthesizer.py`).
416+
This should work just like Part 1, except with string operations.
417+
418+
> **Hints:**
419+
>
420+
> * For operations that require extra integer arguments (e.g., substring indices), you can use the provided `common_indices = [0, 1, 2, ...]` as candidate constants.
421+
> * **Pruning is essential**. Without pruning, your search will explode. Examples:
422+
>   * For `substring(str, start, end)`, skip invalid cases like `start > end`.
423+
>   * When using `StringLiteral`, only allow literals that actually appear in one of the example outputs.
424+
> * The more carefully you prune, the faster your synthesizer will run.
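The two pruning ideas from the hints can be sketched as small filters that run before any program is constructed. Both helper names, `valid_index_pairs` and `useful_literals`, are hypothetical, and the index list is just an example value rather than the provided `common_indices`.

```python
from typing import Iterable, List, Tuple


def valid_index_pairs(indices: Iterable[int]) -> List[Tuple[int, int]]:
    """Enumerate (start, end) candidates, skipping pairs with start > end."""
    idx = sorted(set(indices))
    return [(start, end) for start in idx for end in idx if start <= end]


def useful_literals(candidates: Iterable[str], outputs: Iterable[str]) -> List[str]:
    """Keep only non-empty literals that occur in at least one example output."""
    outs = list(outputs)
    # The empty string is a substring of everything, so it is dropped explicitly.
    return [lit for lit in candidates if lit and any(lit in out for out in outs)]


pairs = valid_index_pairs([0, 1, 2])
literals = useful_literals(["/", "-", ".", ""], ["a/b", "c/d.txt"])
print(pairs)
print(literals)
```

With three candidate indices this keeps 6 of the 9 possible pairs, and the saving compounds at every level of growth because each pruned program would otherwise be combined with every other program.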
425+
426+
### ⚡ Expected Outcome
427+
428+
When everything is working, your synthesizer should solve **all 25 provided test cases** in `test_part2.py`.
429+
430+
For reference, our solution takes about **100 seconds** sequentially on a MacBook Pro (M1 Pro, Python 3.13.0).
431+
432+
```bash
433+
(.venv) ziyang@macbookpro assignment-1> time python test_part2.py
434+
...
435+
Summary:
436+
Total test cases: 25
437+
Synthesis succeeded: 25
438+
Fully correct: 25
439+
Success rate: 100.0%
440+
Accuracy rate: 100.0%
309441

442+
________________________________________________________
443+
Executed in 102.80 secs fish external
444+
usr time 100.87 secs 151.00 micros 100.87 secs
445+
sys time 1.25 secs 757.00 micros 1.25 secs
446+
```
310447

448+
✅ You will receive full credit if your synthesizer finishes all test cases within **30 minutes (parallelized)**.
449+
450+
---
451+
452+
### 🚨 Note on Hard Test Cases
453+
454+
There are additional **hard test cases** (see `get_hard_test_cases`) that we do not expect your DSL + synthesizer to solve (at least not without exponential blowup).
455+
456+
You can run them with:
457+
458+
```bash
459+
python test_part2.py --hard
460+
```
311461

312-
# Part 3: LLM Synthesis for Strings
462+
Feel free to try, but beware of the combinatorial explosion.
