realKD

discovering real knowledge from real data for real users

Mario Boley

I am a lecturer at Monash University in Australia. My work is in advancing tools for human-centred data analysis. In contrast to machine learning methods that aim for automation (e.g., self-driving cars), human-centred data analysis methods aim to empower human users. In order to be effective, such methods and their results must be simple and interpretable yet provide valid answers to critical analysis questions.

I am interested in all aspects of data analysis methods: learning-theoretic foundations, efficient algorithms, and concrete applications. For the latter, I particularly focus on supporting scientific discoveries. For example, I collaborate with materials science researchers to pursue the data-driven discovery of novel functional materials.

Experience

Before moving to Australia, I was in Germany at the Max Planck Institute for Informatics, Saarbrücken (2015-2018), the Fritz Haber Institute of the Max Planck Society, Berlin (2015-2017), the University of Bonn (2011-2015), and Fraunhofer IAIS, Sankt Augustin (2007-2015).

I was also a visiting scholar at the Technion in Haifa, Israel (2012 and 2013), and at the University of Antwerp in Belgium (2011).

Expertise

Subgroup discovery: efficient search algorithms using tight optimistic estimators [1] and redundancy-awareness [15]; applications in materials science [2] and election analysis [12].

General data mining algorithms: efficient fix-point enumeration of closure operators [16], in particular for operators defined on restricted set systems [13] such as for multi-relational pattern discovery [5]; counting and sampling protocols based on the Markov chain Monte Carlo method [17, 14], direct sampling methods [11], and coupling from the past [10].

Machine learning: communication-efficient online learning in distributed settings [6, 3]; structured output prediction [18]; applications to corporate earnings prediction [7], privacy-preserving mobility monitoring [8], and intelligent knowledge discovery systems [9] that use preference learning and multi-armed bandit algorithms to balance exploitation and exploration; and the empirical evaluation of such systems with real users [4].

Selected Publications

[1] M. Boley, B. R. Goldsmith, L. M. Ghiringhelli, and J. Vreeken, “Identifying consistent statements about numerical data with dispersion-corrected subgroup discovery,” Data Mining and Knowledge Discovery, 2017.
@article{boley2017identifying,
author="Boley, Mario
and Goldsmith, Bryan R.
and Ghiringhelli, Luca M.
and Vreeken, Jilles",
title="Identifying consistent statements about numerical data with dispersion-corrected subgroup discovery",
journal="Data Mining and Knowledge Discovery",
year="2017",
month="Jun",
day="28",
doi="10.1007/s10618-017-0520-3",
}
[2] B. R. Goldsmith, M. Boley, J. Vreeken, M. Scheffler, and L. M. Ghiringhelli, “Uncovering structure-property relationships of materials by subgroup discovery,” New Journal of Physics, vol. 19, iss. 1, article 013031, 2017.
@article{goldsmith2017uncovering,
title={Uncovering structure-property relationships of materials by subgroup discovery},
author={Goldsmith, Bryan R and Boley, Mario and Vreeken, Jilles and Scheffler, Matthias and Ghiringhelli, Luca M},
journal={New Journal of Physics},
volume={19},
number={1},
pages={013031},
year={2017},
publisher={IOP Publishing}
}
[3] M. Kamp, S. Bothe, M. Boley, and M. Mock, “Communication-efficient distributed online learning with kernels,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2016, pp. 805–819.
@inproceedings{kamp2016communication,
title={Communication-Efficient Distributed Online Learning with Kernels},
author={Kamp, Michael and Bothe, Sebastian and Boley, Mario and Mock, Michael},
booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
pages={805--819},
year={2016},
organization={Springer International Publishing}
}
[4] M. Boley, M. Krause-Traudes, B. Kang, and B. Jacobs, “Creedo – scalable and repeatable extrinsic evaluation for pattern discovery systems by online user studies,” in ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA), 2015, pp. 20–28.
@inproceedings{boley2015creedo,
title={Creedo---Scalable and Repeatable Extrinsic Evaluation for Pattern Discovery Systems by Online User Studies},
author={Boley, Mario and Krause-Traudes, Maike and Kang, Bo and Jacobs, Bj{\"o}rn},
booktitle={ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA)},
pages={20--28},
year={2015},
organization={ACM}
}
[5] E. Spyropoulou, T. De Bie, and M. Boley, “Interesting pattern mining in multi-relational data,” Data Mining and Knowledge Discovery, vol. 28, iss. 3, pp. 808–849, 2014.
@article{spyropoulou2014interesting,
title={Interesting pattern mining in multi-relational data},
author={Spyropoulou, Eirini and De Bie, Tijl and Boley, Mario},
journal={Data Mining and Knowledge Discovery},
volume={28},
number={3},
pages={808--849},
year={2014},
publisher={Springer}
}
[6] M. Kamp, M. Boley, D. Keren, A. Schuster, and I. Sharfman, “Communication-efficient distributed online prediction by dynamic model synchronization,” in Europ. Conf. on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), 2014, pp. 623–639.
@inproceedings{kamp2014distributed,
year={2014},
booktitle={Europ. Conf. on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD)},
title={Communication-Efficient Distributed Online Prediction by Dynamic Model Synchronization},
author={Kamp, Michael and Boley, Mario and Keren, Daniel and Schuster, Assaf and Sharfman, Izchak},
pages={623--639},
}
[7] M. Kamp, M. Boley, and T. Gärtner, “Beating human analysts in nowcasting corporate earnings by using publicly available stock price and correlation features,” in SIAM Int. Conf. on Data Mining (SDM), 2014.
@inproceedings{kamp2014beating,
title={Beating Human Analysts in Nowcasting Corporate Earnings by using Publicly Available Stock Price and Correlation Features},
booktitle={SIAM Int. Conf. on Data Mining (SDM)},
author={Kamp, Michael and Boley, Mario and G{\"a}rtner, Thomas},
publisher={SIAM},
year={2014}
}
[8] M. Kamp, C. Kopp, M. Mock, M. Boley, and M. May, “Privacy-preserving mobility monitoring using sketches of stationary sensor readings,” in Europ. Conf. on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), Springer, 2013, pp. 370–386.
@incollection{kamp2013privacy,
title={Privacy-Preserving Mobility Monitoring Using Sketches of Stationary Sensor Readings},
author={Kamp, Michael and Kopp, Christine and Mock, Michael and Boley, Mario and May, Michael},
booktitle={Europ. Conf. on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD)},
pages={370--386},
year={2013},
publisher={Springer}
}
[9] M. Boley, M. Mampaey, B. Kang, P. Tokmakov, and S. Wrobel, “One click mining: interactive local pattern discovery through implicit preference and performance learning,” in ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA), 2013, pp. 27–35.
@inproceedings{boley2013one,
title={One click mining: interactive local pattern discovery through implicit preference and performance learning},
author={Boley, Mario and Mampaey, Michael and Kang, Bo and Tokmakov, Pavel and Wrobel, Stefan},
booktitle={ACM SIGKDD Workshop on Interactive Data Exploration and Analytics (IDEA)},
pages={27--35},
year={2013},
organization={ACM}
}
[10] M. Boley, S. Moens, and T. Gärtner, “Linear space direct pattern sampling using coupling from the past,” in ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD), 2012, pp. 69–77.
@inproceedings{boley2012linear,
title={Linear space direct pattern sampling using coupling from the past},
author={Boley, Mario and Moens, Sandy and G{\"a}rtner, Thomas},
booktitle={ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD)},
pages={69--77},
year={2012},
organization={ACM}
}
[11] M. Boley, C. Lucchese, D. Paurat, and T. Gärtner, “Direct local pattern sampling by efficient two-step random procedures,” in ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD), 2011, pp. 582–590.
@inproceedings{boley2011direct,
title={Direct local pattern sampling by efficient two-step random procedures},
author={Boley, Mario and Lucchese, Claudio and Paurat, Daniel and G{\"a}rtner, Thomas},
booktitle={ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD)},
pages={582--590},
year={2011},
organization={ACM}
}
[12] H. Grosskreutz, M. Boley, and M. Krause-Traudes, “Subgroup discovery for election analysis: a case study in descriptive data mining,” in Discovery Science (DS), 2010, pp. 57–71.
@inproceedings{grosskreutz2010subgroup,
title={Subgroup discovery for election analysis: a case study in descriptive data mining},
author={Grosskreutz, Henrik and Boley, Mario and Krause-Traudes, Maike},
booktitle={Discovery Science (DS)},
pages={57--71},
year={2010},
organization={Springer}
}
[13] M. Boley, T. Horváth, A. Poigné, and S. Wrobel, “Listing closed sets of strongly accessible set systems with applications to data mining,” Theoretical Computer Science, vol. 411, iss. 3, pp. 691–700, 2010.
@article{boley2010listing,
title={Listing closed sets of strongly accessible set systems with applications to data mining},
author={Boley, Mario and Horv{\'a}th, Tam{\'a}s and Poign{\'e}, Axel and Wrobel, Stefan},
journal={Theoretical Computer Science},
volume={411},
number={3},
pages={691--700},
year={2010},
publisher={Elsevier}
}
[14] M. Boley, T. Gärtner, and H. Grosskreutz, “Formal concept sampling for counting and threshold-free local pattern mining,” in SIAM Int. Conf. on Data Mining (SDM), 2010, pp. 177–188.
@inproceedings{boley2010formal,
title={Formal Concept Sampling for Counting and Threshold-Free Local Pattern Mining},
author={Boley, Mario and G{\"a}rtner, Thomas and Grosskreutz, Henrik},
booktitle={SIAM Int. Conf. on Data Mining (SDM)},
pages={177--188},
year={2010},
organization={SIAM}
}
[15] M. Boley and H. Grosskreutz, “Non-redundant subgroup discovery using a closure system,” in Machine Learning and Knowledge Discovery in Databases, 2009, pp. 179–194.
@inproceedings{boley2009non,
title={Non-redundant subgroup discovery using a closure system},
author={Boley, Mario and Grosskreutz, Henrik},
booktitle={Machine Learning and Knowledge Discovery in Databases},
pages={179--194},
year={2009},
publisher={Springer},
doi={10.1007/978-3-642-04180-8_29}
}
[16] M. Boley, T. Horváth, and S. Wrobel, “Efficient discovery of interesting patterns based on strong closedness,” Statistical Analysis and Data Mining, vol. 2, iss. 5-6, pp. 346–360, 2009.
@article{boley2009efficient,
title={Efficient discovery of interesting patterns based on strong closedness},
author={Boley, Mario and Horv{\'a}th, Tam{\'a}s and Wrobel, Stefan},
journal={Statistical Analysis and Data Mining},
volume={2},
number={5-6},
pages={346--360},
year={2009},
publisher={Wiley Online Library}
}
[17] M. Boley and H. Grosskreutz, “Approximating the number of frequent sets in dense data,” Knowledge and Information Systems, vol. 21, iss. 1, pp. 65–89, 2009.
@article{boley2009approximating,
title={Approximating the number of frequent sets in dense data},
author={Boley, Mario and Grosskreutz, Henrik},
journal={Knowledge and Information Systems},
volume={21},
number={1},
pages={65--89},
year={2009},
publisher={Springer}
}
[18] S. Vembu, T. Gärtner, and M. Boley, “Probabilistic structured predictors,” in Conf. on Uncertainty in Artificial Intelligence (UAI), 2009, pp. 557–564.
@inproceedings{vembu2009probabilistic,
title={Probabilistic structured predictors},
author={Vembu, Shankar and G{\"a}rtner, Thomas and Boley, Mario},
booktitle={Conf. on Uncertainty in Artificial Intelligence (UAI)},
pages={557--564},
year={2009},
organization={AUAI Press}
}
 Copyright © 2014-2023 Mario Boley