Blackbox

Data Collection Project

  1. Overview
  2. Development Team
  3. List of Publications

Overview

The BlueJ Blackbox data collection project was an initiative by the developers of BlueJ to collect data on how BlueJ was used, in order to increase understanding of how students learn to program. The data was collected for academic research and was intended for use only by computing education researchers.

As of January 2026, the project has been stopped, so access to the data is no longer being granted. Existing access will be revoked in August 2026.

Development Team

      Michael Kölling
      Ian Utting
      Davin McCall
      Neil Brown
      Amjad Altadmri
      Hamza Hamza

List of Publications

Neil C. C. Brown, Amjad Altadmri, Sue Sentance, and Michael Kölling. 2018. Blackbox, Five Years On: An Evaluation of a Large-scale Programming Data Collection Project. In Proceedings of the 2018 ACM Conference on International Computing Education Research (ICER '18). ACM, New York, NY, USA, 196-204. DOI: https://doi.org/10.1145/3230977.3230991
Becker, B. A., Murray, C., Tao, T., Song, C., McCartney, R., and Sanders, K., Fix the First, Ignore the Rest: Dealing with Multiple Compiler Error Messages, Proceedings of the 49th ACM Technical Symposium on Computer Science Education (SIGCSE '18). ACM, New York, NY, USA, 634-639, 2018. DOI: https://doi.org/10.1145/3159450.3159453
Mirza, O. M., Joy, M., and Cosma, G., Suitability of BlackBox dataset for style analysis in detection of source code plagiarism, Seventh International Conference on Innovative Computing Technology (INTECH), Luton, pp. 90-94, 2017. DOI: https://doi.org/10.1109/INTECH.2017.8102424
Brown, N. C. C. and Altadmri, A., Novice Java Programming Mistakes: Large-Scale Data vs. Educator Beliefs, Trans. Comput. Educ. Volume 17, issue 2, Article 7, 21 pages, 2017. DOI: https://doi.org/10.1145/2994154
Keuning, H., Heeren, B., and Jeuring, J., Code quality issues in student programs, Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education, ser. ITiCSE ’17. New York, NY, USA: ACM, pp. 110–115, 2017. DOI: https://doi.org/10.1145/3059009.3059061
McCall, D. and Kölling, M., Meaningful categorisation of novice programmer errors, In Frontiers in Education Conference, pages 2589-2596, 2014. DOI: https://doi.org/10.1109/FIE.2014.7044420
Murray, C., A Comparative Study of Java Compiler Error Profiles Using the Blackbox Dataset, Master's thesis, University College Dublin, 2016.
Altadmri, A., and Brown, N. C. C., Researching Programming Education with Blackbox (Abstract Only), In Proceedings of the 47th ACM Technical Symposium on Computing Science Education (SIGCSE '16), ACM, New York, NY, USA, 702-702, 2016. DOI: https://doi.org/10.1145/2839509.2850479
Altadmri, A., and Brown, N. C. C., 37 Million Compilations: Investigating Novice Programming Mistakes in Large-Scale Student Data, In Proceedings of the 46th ACM Technical Symposium on Computer Science Education (SIGCSE '15), ACM, New York, NY, USA, 522-527, 2015. DOI: https://dx.doi.org/10.1145/2676723.2677258
Brown, N. C. C., Kölling, M., McCall, D., and Utting, I., Blackbox: a large scale repository of novice programmers' activity, In Proceedings of the 45th ACM technical symposium on Computer science education (SIGCSE '14), ACM, New York, NY, USA, 223-228, 2014. DOI: http://dx.doi.org/10.1145/2538862.2538924
Brown, N. C. C. and Altadmri, A., Investigating novice programming mistakes: educator beliefs vs. student data, In Proceedings of the tenth annual conference on International computing education research (ICER '14), ACM, New York, NY, USA, 43-50, 2014. DOI: https://dx.doi.org/10.1145/2632320.2632343
Brown, N. C. C., Introduction to analysing the BlueJ blackbox data (abstract only), In Proceedings of the 45th ACM technical symposium on Computer science education (SIGCSE '14), ACM, New York, NY, USA, 748-748, 2014. DOI: http://dx.doi.org/10.1145/2538862.2539012