Scaling models on simple predictive objectives is often insufficient to overcome spurious correlations that degrade out-of-distribution generalization. While Domain Generalization (DG) methods aim to learn invariant representations, often motivated by causality principles, they can be computationally expensive and underperform simple Empirical Risk Minimization (ERM). We propose a data-centric alternative: Geometric Robustness via Invariant Training (GRIT). Instead of explicit causal modeling, GRIT enforces a geometric constraint during fine-tuning using a small set of noisy invariant pairs, which implicitly encode the desired invariance property. We provide the first finite-sample analysis of this setting and show that our framework generalizes latent linear causal models. We prove that GRIT's robust generalization error decreases at a rate of $O(1/\sqrt{k})$ in the number of invariant pairs $k$, offering a scalable alternative to both ERM and explicit causal modeling for out-of-distribution robustness.
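To make the setup concrete, one plausible instantiation of the objective described above is sketched below, assuming the geometric constraint takes the form of a paired-representation penalty added to the ERM loss; the loss $\ell$, representation map $\Phi_\theta$, predictor $f_\theta$, weight $\lambda$, and pairs $(x_i, x_i')$ are illustrative placeholders rather than the paper's exact formulation:
\[
\min_{\theta}\;
\underbrace{\frac{1}{n}\sum_{j=1}^{n} \ell\big(f_\theta(x_j),\, y_j\big)}_{\text{ERM term over } n \text{ labeled samples}}
\;+\;
\lambda \,
\underbrace{\frac{1}{k}\sum_{i=1}^{k} \big\| \Phi_\theta(x_i) - \Phi_\theta(x_i') \big\|_2^2}_{\text{geometric penalty over } k \text{ noisy invariant pairs}}
\]
Under this reading, each pair $(x_i, x_i')$ shares the same invariant content while differing in spurious factors, so shrinking the representation gap between pair members discourages the model from relying on spurious features; the $O(1/\sqrt{k})$ rate then governs how quickly the penalty's empirical average over $k$ pairs concentrates around the population invariance constraint.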