UIs and mental models

I read an interesting reddit thread recently that got me thinking about this topic. The question was deceptively simple: “How do you picture a year in your mind?”, and the responses were mind-boggling. Many people said they visualize a year as a circle, with follow-up comments debating whether the months are laid out clockwise or counter-clockwise, and whether the year starts at the 12 o’clock position, at 3 o’clock, or at some arbitrary point. Others (myself included) were flat-out stunned that people see a year as a circle. I have always visualized it as a slightly tweaked version of a typical calendar grid, with the months laid out horizontally but staggered vertically at the end of every third month.

This got me thinking about the mental models people build around the basic things they learn and how those models stick with them over time. For instance, I attended kindergarten in India. When we were taught the cardinal directions, the teacher always put up a map of India to show where North, South, East, and West were. It was always in terms of the “northernmost state” or the “westernmost state”. This is how I learned the directions and where they sit in relation to each other. To this day, every time I need to visualize where I’m facing and which direction is immediately to my left or right, I picture a map of India in my head and mentally lay out the directions to get my bearings. I’ve been living in the United States for over twelve years now, but my mental model is still anchored to the map of India and where its northernmost, easternmost, southernmost, and westernmost states are located.

The same goes for learning how words are spelled. Growing up, a lot of my grade-school examinations included dictation: listening to what someone (usually the teacher) said out loud and writing it down. This forced me to listen to every single word, break down its spelling, and write it out within its sentence. Often, I’d have to go back and correct commonly confused words like you’re/your or their/they’re/there based on the contextual information I’d gained by the time the teacher finished the sentence. This taught me to pay very close attention to the words people were saying and to spell them out in my head as they spoke. Over time, I found it very easy to guess the proper spelling of a word just by hearing someone say it, thus staying true to the stereotype of Indian people being great at spelling bees.

When I moved to the United States, I was taken aback at how many spelling mistakes (and even grammatical ones) students in high school and college were making in their writing. They were supposed to have mastered this years ago and to be focusing on critical thinking and analysis, yet it seemed like they were held back by the basics of how the English language is structured. I then realized that a lot of why spelling “clicked” for me was because I had built my mental model of English around the spelling of words and how they string together to form a sentence. Students who grew up speaking colloquial English with their peers, however, must have built their mental model of the language around spoken English. They focused on things like when to emphasize certain words, where to slow down and where to speed up, when to break up their speech, and how to make it flow smoothly from one sentence to the next. I grew up learning and speaking four different languages, switching between them frequently at school, at home, and with friends, so I was terrible at this in English while others around me seemed to have mastered it. Over time I got good enough at it as well, but it was quite the struggle at first. This just goes to show how much of an impact mental models can have on something as basic and yet as crucial as language and written communication.

In that reddit thread I mentioned earlier, there were quite a few responses from younger folks (born around 2005 or later) who said their visualization of a year was based on the iOS date picker. Yes, the scroll wheel that pops up at the bottom of your screen when you need to enter a date on a website or a mobile app. This blew my mind. These kids had probably started using iPhones at a very young age and were exposed to the iOS date picker long before they ever saw a physical calendar. They visualize years progressing as a scrolling list of numbers (…2012, 2013, 2014…) with the current year highlighted in the middle. The same goes for the day and the month. These kids are going to spend their entire lives, maybe the next fifty or sixty years, looking at a year this way.
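
For what it’s worth, the pattern those kids grew up on is trivially easy to reproduce today. Here’s a minimal SwiftUI sketch of that wheel-style picker (the view name is hypothetical; this is an illustration, not Apple’s actual implementation):

```swift
import SwiftUI

// A minimal sketch of the wheel-style date picker described above:
// scrolling drums with the current selection centered and highlighted.
// (Illustrative only; not Apple's internal implementation.)
struct YearWheelPicker: View {
    @State private var selectedDate = Date()

    var body: some View {
        DatePicker(
            "Date",
            selection: $selectedDate,
            displayedComponents: .date
        )
        // .wheel renders the classic iOS drums, with the selected
        // day, month, and year sitting in the middle row.
        .datePickerStyle(.wheel)
        .labelsHidden()
    }
}
```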

This may not seem like a big deal if you’re not a designer, but it is. This is how these people are going to perceive the passage of time, a concept so elementary to humans that the ways we represent it have never stopped evolving, from sundials to wristwatches to LED displays. When the designers at Apple came up with the UI, they didn’t say “this is how we want people to perceive time from now on”; they simply based their designs on existing real-world calendar scrollers where you had to adjust the current date every day. The picker was incredibly skeuomorphic when it first came out and was simplified in iOS 7. They tested it for usability and accessibility, not for how a generation growing up with it might warp its sense of chronology. And now an entire generation of kids sees time as scroll wheels and date pickers.

I’m sure there are countless other examples of this. When you see an empty text field, you expect a keyboard input. A generation ago, people who saw an empty line on a piece of paper would ask for a pen to fill it out by hand. Swipe gestures and their affordances have completely changed how people interact with screens; kids now intuitively tap on every screen they see, expecting it to be a touchscreen. They build mental models of how the letters of the English alphabet are ordered based on the layout of a QWERTY keyboard, before they even learn the ABCDE order.

When designers make these things, they’re simply trying to meet a business objective in as little time as possible, ensuring the design is just good enough in terms of simplicity, accessibility, and usability. It works in user testing and people seem to get it. Great, let’s ship it. Only once a product skyrockets and reaches the level of market penetration the iPhone did does the whole mental-model question become an issue. The designers then find themselves in a tough spot: they want to change the interface and make it better, but so many users have already “learned” it the way it is that changing it now would upset them and force even more re-learning.

UI design has come a long way in just the past ten years, going from extremely skeuomorphic to flat to minimalist to nearly no UI at all with voice assistants. With no visual reference of any sort, I really wonder how the newest generation of kids, who talk to Alexa and Siri before they ever talk to a schoolteacher, will build their mental models of how human beings respond when asked a question. You can get very deep and dystopian with this if you think about it for too long, but it’s a question the designers weren’t prepared to answer when they first wrote the dialogue responses for these voice assistants. It boggles my mind to think about how the work we designers do every day can have such a massive impact on an entire generation of human beings. It’s both highly empowering and incredibly terrifying, so I truly hope this bothers and inspires other designers out there as much as it does me.