Translating modern machine learning algorithms into clinical applications requires addressing challenges related to explainability and the handling of nuanced confounding factors. To interpret results correctly, it is essential to remove or explain the effect of confounding variables (or metadata). Confounding variables affect the relationship between input training data and target outputs, so a model trained on such data learns feature distributions biased by the confounders. A recent promising solution, Metadata Normalization (MDN), estimates the linear relationship between the metadata and each feature using a non-trainable closed-form solution. However, this estimate is confined to the samples of a mini-batch and may therefore lead to oscillating performance. In this paper, we extend MDN with a penalty approach (referred to as PMDN). We cast the problem as a bi-level nested optimization problem and then approximate its objective with a penalty method, so that the linear parameters within the MDN layer become trainable and are learned over all samples. This allows PMDN to be plugged into any architecture, even those unsuited to batch-level operations, such as transformers and recurrent models. We show that PMDN improves model accuracy and independence from confounders over MDN, both in a synthetic experiment and in a multi-label, multi-site classification of magnetic resonance images.
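For concreteness, the sketch below illustrates the core idea in PyTorch. It is a minimal sketch, not the authors' reference implementation: the class name `PMDNLayer`, the penalty weight, and the training-loop wiring are assumptions for illustration. The layer holds trainable linear parameters `beta` mapping metadata to features, subtracts the predicted metadata effect from the features, and exposes a residual penalty that is added to the task loss, so `beta` is learned by gradient descent across all mini-batches rather than re-estimated in closed form per batch.

```python
import torch
import torch.nn as nn


class PMDNLayer(nn.Module):
    """Sketch of a penalty-based metadata normalization layer.

    Bi-level idea (notation assumed for illustration): the inner problem
    beta* = argmin_beta ||X_md @ beta - f||^2 is replaced by a penalty
    term lambda * ||X_md @ beta - f||^2 added to the task loss, so beta
    becomes an ordinary trainable parameter.
    """

    def __init__(self, num_metadata: int, num_features: int):
        super().__init__()
        # Linear map from metadata to each feature dimension.
        self.beta = nn.Parameter(torch.zeros(num_metadata, num_features))

    def forward(self, features: torch.Tensor, metadata: torch.Tensor):
        # Component of the features explained by the metadata.
        explained = metadata @ self.beta
        # Residual features with the metadata effect removed.
        residual = features - explained
        # Penalty approximating the inner least-squares problem. Features
        # are detached here so the penalty only updates beta, mirroring
        # the inner problem of the bi-level formulation; this is a design
        # choice of this sketch.
        penalty = ((features.detach() - explained) ** 2).mean()
        return residual, penalty


# Usage sketch with toy tensors; the penalty weight 10.0 is assumed.
layer = PMDNLayer(num_metadata=3, num_features=128)
features = torch.randn(32, 128, requires_grad=True)  # backbone output
metadata = torch.randn(32, 3)                        # confounders, e.g. site
residual, penalty = layer(features, metadata)
task_loss = residual.pow(2).mean()                   # stand-in for a real loss
loss = task_loss + 10.0 * penalty
loss.backward()
```

Because the penalty is an ordinary differentiable loss term, this formulation needs no batch-level closed-form solve, which is what lets it drop into architectures such as transformers and recurrent models.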