HouseMind: Tokenization Allows Multimodal Large Language Models to Understand, Generate and Edit Architectural Floor Plans
Published in: CVPR 2026
A multimodal large language model that unifies floor plan understanding, generation, and editing via discrete room-instance tokens, enabling controllable and interpretable operations.
Recommended Citation: Qin, S.Z., Weber, R.E., Lu, X.Z., 2026. Tokenization Allows Multimodal Large Language Models to Understand, Generate and Edit Architectural Floor Plans. CVPR 2026. https://arxiv.org/abs/2603.11640
Paper URL | Project Page