Multimodal resources (e.g., images, sounds, and animations) play an important role in many aspects of education, such as the development of teaching and learning materials, classroom teaching, and assessment. This has led to ongoing research interest in how knowledge is visually constructed in education, including foreign language education. In the past two decades, English textbooks in China have made extensive use of multimodal resources, especially images. However, there are problems in the visual construction of language and general knowledge in the textbooks currently in use, and classroom teachers are not always able to use multimodal resources effectively in their teaching. This paper investigates these problems and puts forward solutions, with a focus on primary English teaching.