Commit 93e10e6

Add judge-builder repo into sandbox for DBX Marketplace listing (#600)
As titled. This is nearly a direct copy of the existing `judge-builder` repo here: https://github.com/databricks-solutions/judge-builder

Followed the guide for contributing [here](https://github.com/databrickslabs/sandbox/blob/main/CONTRIBUTING.md).

- [x] Added code owners
- [x] Created the README in the expected format
- [x] Followed best practices for Databricks Apps on Marketplace
- [x] Completed a SDR (SDR-3817)
1 parent ada7d55 commit 93e10e6

154 files changed: +23767 -0 lines changed

CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ downstreams @nfx @alexott
 feature-registry-app @yang-chengg @mparkhe @mingyangge-db @stephanielu5
 go-libs @nfx @alexott
 ip_access_list_analyzer @alexott
+judge-builder @smoorjani @alkispoly-db
 ka-chat-bot @taiga-db
 metascan @nfx @alexott
 runtime-packages @nfx @alexott

judge-builder/.gitignore

Lines changed: 310 additions & 0 deletions
@@ -0,0 +1,310 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*.pyc
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+develop-eggs/
+dist/
+# Note: client/build/ is checked in for Databricks Apps deployment
+downloads/
+eggs/
+.eggs/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+# For a library or package, you might want to ignore these files since the code is
+# intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+# This is especially recommended for binary packages to ensure reproducibility, and is more
+# commonly ignored for libraries.
+# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+# in version control.
+# https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.env.local
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+# be added to the global gitignore or merged into this project gitignore. For a PyCharm
+# project, it is recommended to comment out the line below to share the project gitignore.
+.idea/
+
+# Node.js dependencies
+node_modules/
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+pnpm-debug.log*
+lerna-debug.log*
+
+# Runtime data
+pids
+*.pid
+*.seed
+*.pid.lock
+
+# Coverage directory used by tools like istanbul
+coverage/
+*.lcov
+
+# nyc test coverage
+.nyc_output
+
+# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
+.grunt
+
+# Bower dependency directory (https://bower.io/)
+bower_components
+
+# node-waf configuration
+.lock-wscript
+
+# Compiled binary addons (https://nodejs.org/api/addons.html)
+build/Release
+
+# Dependency directories
+jspm_packages/
+
+# Snowpack dependency directory (https://snowpack.dev/)
+web_modules/
+
+# TypeScript cache
+*.tsbuildinfo
+
+# Optional npm cache directory
+.npm
+
+# Optional eslint cache
+.eslintcache
+
+# Optional stylelint cache
+.stylelintcache
+
+# Microbundle cache
+.rpt2_cache/
+.rts2_cache_cjs/
+.rts2_cache_es/
+.rts2_cache_umd/
+
+# Optional REPL history
+.node_repl_history
+
+# Output of 'npm pack'
+*.tgz
+
+# Yarn Integrity file
+.yarn-integrity
+
+# dotenv environment variable files
+.env.development.local
+.env.test.local
+.env.production.local
+.env.local
+
+# parcel-bundler cache (https://parceljs.org/)
+.cache
+.parcel-cache
+
+# Next.js build output
+.next
+out
+
+# Nuxt.js build / generate output
+.nuxt
+dist
+
+# Gatsby files
+.cache/
+public
+
+# Vite build output
+dist
+dist-ssr
+*.local
+
+# Rollup build output
+dist/
+
+# Ruff
+.ruff_cache/
+
+# Storybook build outputs
+.out
+.storybook-out
+storybook-static
+
+# Temporary folders
+tmp/
+temp/
+
+# Editor directories and files
+.vscode
+.vscode/*
+!.vscode/extensions.json
+.idea
+.DS_Store
+*.suo
+*.ntvs*
+*.njsproj
+*.sln
+*.sw?
+
+# OS generated files
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+Thumbs.db
+
+# Application specific
+mlruns/
+mlartifacts/
+
+# Databricks
+.databricks/
+
+# UV
+.uv/
+
+# Generated client
+client/src/client/
+
+# Generated requirements files (keep requirements.txt for deployment)
+dev-requirements.txt
+
+client/node_modules/
+client/package-lock.json
+.env.local
+tmp-deploy-logs.txt

judge-builder/CHANGELOG.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# CHANGELOG
+
+## 0.3.4 (2025-10-28)
+
+### Security Fixes
+
+- Fixed path traversal vulnerability in static file serving that could allow access to files outside the build directory
+- Restricted CORS middleware to development mode only (controlled by `DEPLOYMENT_MODE` environment variable)
+
+## 0.3.3 (2025-10-18)
+
+### Bug Fixes
+
+- Fixed alignment timeout issues in deployed environments by moving alignment to background tasks with polling
+
+## 0.3.2 (2025-10-17)
+
+### Improvements
+
+- Changed default alignment model back to databricks-hosted endpoint
+- Removed alignment model selector from judge creation form
+- Upgraded databricks-agents to >=1.8.0
+
+### Bug Fixes
+
+- Fixed duplicate traces being added to labeling sessions
+
+## 0.3.1 (2025-10-09)
+
+### Improvements
+
+- **Client Attribution**: Added monkey patch to set custom client name (`judge-builder-v{VERSION}`) in chat completions for better tracking and analytics
+
+## 0.3.0 (2025-10-03)
+
+### Features
+
+- **Alignment Model Selector**: Introduced a model selector for the alignment "teacher" model, allowing users to choose custom serving endpoints
+
+### Improvements
+
+- Made alignment limit clear with improved UI messaging about the 10-example requirement
+- Reduced logging bloat in judge builder and fixed watch.sh script
+- Fixed bug in iterative alignments where evaluations were not running on the full set of traces, causing incorrect cached results to be used
+
+## 0.2.1 (2025-09-24)
+
+### Improvements
+
+- Cache results of schema analysis to get potential output types
+
+## 0.2.0 (2025-09-23)
+
+### Features
+
+- **MLflow Integration**: Integrated `make_judge` and `align` functions from MLflow 3.4.0 for enhanced judge creation and alignment capabilities
+- **Expanded Judge Support**: Added support for judges over expectations or traces (e.g., correctness judges), extending beyond the original use cases
+- **Arbitrary Output Support**: Enhanced judge outputs to support arbitrary categorical values instead of being limited to just pass/fail binary outcomes
+
+### Improvements
+
+- Enhanced alignment visualization with improved metrics display and filtering options
+- Added version comparison indicators showing agreement/disagreement progression
+- Improved UI layout for better alignment results presentation
+
+## 0.1.0 (2025-08-26)
+
+Initial version of Judge Builder
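
Reviewer note on the 0.3.4 entry: the changelog mentions a path traversal fix in static file serving, but the fix itself is not visible in this part of the diff. As a rough illustration only, one common way to guard against this kind of issue is to resolve the requested path and check that it still sits under the build directory. The names `resolve_static_path` and `build_dir` below are hypothetical and are not taken from the judge-builder code.

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(build_dir: Path, requested: str) -> Optional[Path]:
    """Return the file under build_dir for `requested`, or None if it is unsafe.

    Hypothetical helper for illustration; not the judge-builder implementation.
    """
    root = build_dir.resolve()
    # Strip leading slashes so an absolute-looking request cannot replace the root when joined.
    candidate = (root / requested.lstrip("/")).resolve()
    # After resolving ".." segments and symlinks, the path must still be inside root.
    if candidate != root and root not in candidate.parents:
        return None
    # Only serve regular files that actually exist.
    return candidate if candidate.is_file() else None
```

A caller would typically return a 404 when this helper yields `None`, rather than revealing whether the path was rejected or simply missing.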
