This is an online supplement for the paper "Empirical Hardness of Finding Optimal Bayesian Network Structures", B. Malone, K. Kangas, M. Järvisalo, M. Koivisto, P. Myllymäki (under revision). We study the empirical hardness of learning an optimal structure of a Bayesian network from statistical data via score-based methods. Specifically, we evaluate three state-of-the-art approaches or "solvers" for learning optimal structures on a collection of benchmark data. We then learn models for predicting the runtime of each solver on a given problem instance, based on various instance features, and use the models to construct algorithm portfolios.
This supplement provides the experiment data, including the benchmark data sets used, the feature values of all instances, the predictive models learned, as well as the measured and predicted runtimes of solvers. We have also made our data available as a scenario in the ASlib algorithm selection library.
Supplemental material for the preliminary AAAI-14 version of this work can be found here.
The table contains one line for each BNSL instance, and each line has the following attributes:
name | name of the instance | |
dataset | name of the source dataset | |
category | dataset class, one of: real, sampled, synthetic | |
var | number of variables | |
rec | number of records in the source dataset | |
f | scoring function, one of: bdeu, bic | |
ess | equivalent sample size (? for BIC scores) | |
pl | parent limit | |
[solver] | (8 attributes) | running time of [solver], either a number or "to" (timeout) or "mo" (memout) |
[feature] | (86 attributes) | value of [feature] |
[feature-set].[model-class]_[solver]_prediction | (144 attributes) | predicted runtime of [solver] by the predictor trained using [model-class] and [feature-set] |