@@ -18,7 +18,21 @@ source venv/bin/activate
 pip3 install -r requirements.txt
 ```

-# Reproduce Experiments
+# How to experiment with DeepDB on a new Dataset
+- Specify a new schema in the schemas folder
+- Due to the current implementation, make sure to declare
+    - the primary key,
+    - the filename of the CSV sample file,
+    - the correct table size and sample rate,
+    - the relationships among tables, unless you only run queries over a single table,
+    - any non-key functional dependencies (this is rather an implementation detail),
+    - and include all columns in the no-compression list by default (as done for the IMDB benchmark).
+- To further reduce the training time, you can exclude columns you do not need in your experiments (also done in the IMDB benchmark)
+- Generate the HDF/sampled HDF files and learn the RSPN ensemble
+- Use the RSPN ensemble to answer queries
+- For reference, please check the commands to reproduce the results of the paper
+
+# How to Reproduce Experiments in the Paper

 ## Cardinality Estimation
 Download the [Job dataset](http://homepages.cwi.nl/~boncz/job/imdb.tgz).
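
The schema checklist in the added README section (primary key, CSV sample file, table size and sample rate, relationships, functional dependencies, no-compression list) can be sketched as a small pre-flight check. This is a minimal illustration only: `TableSpec`, `SchemaSpec`, `check_schema`, and the toy `customer`/`orders` tables are hypothetical names invented here and are not DeepDB's schema API. Consult the existing files in the schemas folder for the real declarations.

```python
from dataclasses import dataclass, field


@dataclass
class TableSpec:
    """Hypothetical stand-in for one table declaration (not DeepDB's API)."""
    name: str
    attributes: list
    primary_key: list
    csv_file: str
    table_size: int                      # row count of the full table
    sample_rate: float = 1.0             # fraction of rows in the CSV sample
    no_compression: list = field(default_factory=list)
    functional_dependencies: list = field(default_factory=list)  # (determinant, dependent)


@dataclass
class SchemaSpec:
    """Hypothetical container for tables and foreign-key relationships."""
    tables: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)  # (child, fk, parent, pk)

    def add_table(self, table):
        self.tables[table.name] = table

    def add_relationship(self, child, fk, parent, pk):
        self.relationships.append((child, fk, parent, pk))


def check_schema(schema):
    """Return a list of human-readable problems; an empty list means the checklist passes."""
    problems = []
    for t in schema.tables.values():
        cols = set(t.attributes)
        if not t.primary_key or not set(t.primary_key) <= cols:
            problems.append(f"{t.name}: primary key missing or not among the columns")
        if not t.csv_file:
            problems.append(f"{t.name}: CSV sample filename missing")
        if t.table_size <= 0:
            problems.append(f"{t.name}: table size must be positive")
        if not 0 < t.sample_rate <= 1:
            problems.append(f"{t.name}: sample rate must be in (0, 1]")
        missing = [a for a in t.attributes if a not in t.no_compression]
        if missing:
            problems.append(f"{t.name}: columns missing from no-compression list: {missing}")
        for det, dep in t.functional_dependencies:
            if det not in cols or dep not in cols:
                problems.append(f"{t.name}: functional dependency {det} -> {dep} uses unknown column")
    for child, fk, parent, pk in schema.relationships:
        for table_name, column in ((child, fk), (parent, pk)):
            table = schema.tables.get(table_name)
            if table is None:
                problems.append(f"relationship references unknown table {table_name}")
            elif column not in table.attributes:
                problems.append(f"relationship references unknown column {table_name}.{column}")
    return problems


def build_example_schema():
    """Toy two-table schema; names, sizes, and sample rate are made up for illustration."""
    schema = SchemaSpec()
    customer_cols = ["id", "region"]
    order_cols = ["id", "customer_id", "amount"]
    schema.add_table(TableSpec("customer", customer_cols, ["id"], "customer_sample.csv",
                               table_size=1000, no_compression=list(customer_cols)))
    schema.add_table(TableSpec("orders", order_cols, ["id"], "orders_sample.csv",
                               table_size=50000, sample_rate=0.1,
                               no_compression=list(order_cols)))
    schema.add_relationship("orders", "customer_id", "customer", "id")
    return schema


if __name__ == "__main__":
    for problem in check_schema(build_example_schema()):
        print(problem)
```

Running a check like this before generating the HDF files would surface a forgotten primary key or an inconsistent relationship early, rather than during ensemble training.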