MESA is a novel generative model based on latent denoising diffusion that generates 2.5D representations of terrain from natural-language text prompts. It produces two co-registered modalities: an optical image and a depth map.
This is a test version of the demo app. Please be aware that MESA primarily supports complex, mountainous terrain rather than flat land.
⚠️ The generated image is quite large, so at the larger resolution (768) the surface may take a while to load.
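
To illustrate what a 2.5D representation built from co-registered optical and depth maps can look like, and why the 768 resolution is heavier to display, here is a minimal NumPy sketch. It is not MESA's or the demo's actual code: the function name `heightfield_to_points`, the `stride` parameter, and the random arrays standing in for model output are all assumptions made for illustration.

```python
import numpy as np

def heightfield_to_points(rgb, depth, stride=1):
    """Combine a co-registered RGB image and depth map into coloured 3D points.

    Hypothetical helper, not part of MESA. Each pixel (x, y) becomes one
    vertex (x, y, depth[y, x]) that takes its colour from rgb[y, x].
    """
    if rgb.shape[:2] != depth.shape:
        raise ValueError("rgb and depth must share the same height and width")
    # Optional subsampling to reduce the number of vertices to render.
    rgb = rgb[::stride, ::stride]
    depth = depth[::stride, ::stride]
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    vertices = np.stack([xs, ys, depth], axis=-1).reshape(-1, 3).astype(np.float32)
    colors = rgb.reshape(-1, 3)
    return vertices, colors

# Synthetic stand-ins for the two generated modalities (assumed 768 x 768).
size = 768
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(size, size, 3), dtype=np.uint8)
depth = rng.random((size, size), dtype=np.float32)

full_vertices, full_colors = heightfield_to_points(rgb, depth)
coarse_vertices, coarse_colors = heightfield_to_points(rgb, depth, stride=2)

print(full_vertices.shape[0])    # 589824 vertices at full resolution
print(coarse_vertices.shape[0])  # 147456 vertices with stride=2
```

Because the two maps share a pixel grid, no alignment step is needed: every depth value already has a matching colour. Subsampling with `stride=2` cuts the vertex count by a factor of four, which is one possible way to make a large surface quicker to load at the cost of detail.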