Integrated statistical inference for software effort estimation based on incomplete historical data.

  • Marco Antonio Guzmán López, Instituto Tecnológico y de Estudios Superiores de Monterrey
  • René Santaolaya Salgado, Centro Nacional de Investigación y Desarrollo Tecnológico
  • Vitervo López Caballero, Centro Nacional de Investigación y Desarrollo Tecnológico
  • Blanca Dina Valenzuela Robles, Centro Nacional de Investigación y Desarrollo Tecnológico
Keywords: estimation by analogy, feature selection, imputation techniques, missing data, project success statistics

Abstract

Inaccurate effort estimation in software development leads to functional non-compliance and to significant deviations from planned cost, time, and resources. To address this problem, this study proposes a novel data imputation scheme called Integrated Statistical Inference (ISI), which combines mean or mode imputation with a proposed method based on statistical information from software projects (ISPI). Applied to a random sample from the ISBSG (International Software Benchmarking Standards Group) repository, the ISI scheme imputed 29% of the missing data. In addition, a regression model was developed to predict effort; it achieved a mean magnitude of relative error (MMRE) of 14.05%, below the 25% threshold commonly regarded as acceptable in the literature.
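As an illustration of two ingredients named in the abstract, the following minimal sketch shows generic mean or mode imputation of a single column and the usual MMRE calculation. It is not the authors' ISI scheme: the ISPI component is not described here and is therefore omitted, and the function names `impute_column` and `mmre` as well as the sample values are hypothetical.

```python
from statistics import mean, mode


def impute_column(values):
    """Fill missing entries (None) with the mean for numeric columns,
    or with the mode (most frequent value) for categorical columns."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to infer from: return the column unchanged.
        return list(values)
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in observed
    )
    fill = mean(observed) if numeric else mode(observed)
    return [fill if v is None else v for v in values]


def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual).

    Assumes every actual effort value is strictly positive.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    return mean(abs(a - p) / a for a, p in zip(actual, predicted))


# Hypothetical values, for illustration only (not taken from ISBSG).
effort_hours = impute_column([10, None, 30])
language = impute_column(["Java", None, "Java", "C"])
error = mmre([100, 200], [90, 220])
```

Multiplying the value returned by `mmre` by 100 gives the percentage form used in the abstract.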

Published
2026-05-01
Section
Articles