Current challenges in clinical breath analysis include large data sizes and clinically irrelevant variation in exhaled breath measurements, both of which urgently need to be addressed with suitable data analysis tools. In this study, three baseline correction methods are evaluated within a previously developed data size reduction strategy for multi-capillary column-ion mobility spectrometry (MCC-IMS) datasets. The Top-hat method, introduced here for the first time in breath data analysis, is presented as the optimal baseline correction method. The refined data size reduction strategy is then applied to a large breathomic dataset covering healthy and respiratory disease populations. New insights into the differences in MCC-IMS spectra associated with respiratory diseases are provided, demonstrating the added value of the refined data analysis strategy in clinical breath analysis. (C) 2016 Elsevier B.V. All rights reserved.
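
As an illustration of the kind of baseline correction named above, the following is a minimal one-dimensional sketch. It assumes that the Top-hat method refers to the morphological white top-hat transform, in which the baseline is estimated by an opening (erosion followed by dilation) and subtracted from the signal. The function names, the `half_width` parameter, and the synthetic signal are hypothetical; the window size and the handling of two-dimensional MCC-IMS spectra used in the study are not stated in this abstract.

```python
def _sliding(values, half_width, pick):
    """Apply `pick` (min or max) over a centred window, truncated at the edges."""
    n = len(values)
    return [
        pick(values[max(0, i - half_width): min(n, i + half_width + 1)])
        for i in range(n)
    ]


def tophat_baseline_correct(signal, half_width):
    """White top-hat sketch: signal minus its morphological opening.

    `half_width` is a hypothetical tuning parameter; the window should be
    wider than the peaks of interest and narrower than baseline features.
    Returns (corrected_signal, estimated_baseline).
    """
    eroded = _sliding(signal, half_width, min)     # erosion: local minimum
    baseline = _sliding(eroded, half_width, max)   # dilation of the erosion = opening
    corrected = [s - b for s, b in zip(signal, baseline)]
    return corrected, baseline


# Synthetic example (not study data): a slowly rising baseline with one narrow peak.
signal = [0.1 * i for i in range(21)]
signal[10] += 5.0
corrected, baseline = tophat_baseline_correct(signal, half_width=2)
```

In this sketch the opening never exceeds the original signal, so the corrected values are non-negative, and the narrow peak survives while the slow drift is largely removed. Some residue can remain near the ends of the signal because the windows are truncated there.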