CogCoM is a large vision-language model that solves problems through a series of image manipulations, yielding more accurate results and clearer reasoning. (via Hugging Face)