G18 : A Comparative Analysis of Model Complexity and Predictive Reliability Under Data Scarcity Conditions


Students Grace Bui
School HCDSB - St. Mary Elementary School - Oakville
Level Junior 7/8 - Grade 7
Group Group 8 - Engineering and Computing II
Abstract This project investigates how machine learning model complexity affects predictive reliability when training data is limited. Many machine learning models achieve high accuracy when trained on a full dataset, but their performance can degrade unpredictably when data is scarce or noisy. By comparing a simple model (logistic regression) with a more complex model (random forest), this study aims to determine which type of model maintains more stable and robust predictions under realistic constraints such as scarce or noisy training data. The goal is not to build a new AI system, but to analyze how model complexity affects reliability, overfitting, and generalization in practical deployment scenarios.
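One way the comparison described in the abstract could be set up is sketched below. This is only an illustration under stated assumptions, not the project's actual method: it assumes scikit-learn is available, uses a synthetic dataset from `make_classification` as a stand-in for the project's data (which is not given here), and the function name `stability_under_scarcity`, the training sizes, and the number of repeats are all hypothetical choices. Each model is trained on several small random subsets and scored on one fixed held-out set, so the mean accuracy indicates generalization and the standard deviation indicates stability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def stability_under_scarcity(train_sizes=(20, 50, 200), n_repeats=10, seed=0):
    """Return {model_name: {train_size: (mean_accuracy, std_accuracy)}}."""
    # Assumption: synthetic data with 5% label noise stands in for the real dataset.
    X, y = make_classification(
        n_samples=2000, n_features=20, n_informative=5,
        flip_y=0.05, random_state=seed,
    )
    # Hold out half of the data as a fixed test set shared by every run.
    X_pool, X_test, y_pool, y_test = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=seed
    )
    # Factories so that every run starts from a freshly initialized model.
    models = {
        "logistic_regression": lambda: LogisticRegression(max_iter=1000),
        "random_forest": lambda: RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    results = {name: {} for name in models}
    for size in train_sizes:
        scores = {name: [] for name in models}
        for r in range(n_repeats):
            # Draw a small stratified training subset; a new one each repeat.
            X_tr, _, y_tr, _ = train_test_split(
                X_pool, y_pool, train_size=size,
                stratify=y_pool, random_state=seed + r,
            )
            for name, make_model in models.items():
                model = make_model().fit(X_tr, y_tr)
                scores[name].append(model.score(X_test, y_test))
        for name in models:
            results[name][size] = (
                float(np.mean(scores[name])),
                float(np.std(scores[name])),
            )
    return results


if __name__ == "__main__":
    for name, by_size in stability_under_scarcity().items():
        for size, (mean_acc, std_acc) in by_size.items():
            print(f"{name:20s} n={size:4d} mean={mean_acc:.3f} std={std_acc:.3f}")
```

In this sketch, a smaller standard deviation at a given training size would suggest more stable predictions, while a large gap between training and test accuracy (not computed here) would be one sign of overfitting. Which model comes out ahead depends on the dataset, so no outcome is assumed.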