How to play a foreign language

Would you teach a child how to swim by having the child read a book about swimming? I doubt it. Different abilities require completely different approaches to learning. In this post, I’m going to talk about the differences between becoming fluent in a foreign language and learning other abilities, and why insisting on applying concepts that belong to other abilities is preventing us from learning foreign languages as well as we could.

It is usually assumed that you have to practice speaking and writing in a foreign language a lot to become good at it. Certainly, there’s a whole range of abilities that require a lot of practice to be performed successfully. That’s especially true of abilities that require the formation of muscle memory, like playing sports or playing an instrument. It is, however, not reasonable to extend that assumption to all the possible abilities a person can learn. “No pain, no gain” doesn’t apply to all aspects of human existence. Certain abilities don’t require that much practice at all, and can be mastered with input alone. One such example is the ability to spell correctly. Research shows that children manage to consistently forget every spelling rule they were taught in school and that the only important factor affecting how well people spell in the long run is how much reading they do (How well do people spell?).

I was inspired to write this article by J. Marvin Brown’s book From the Outside In. In Brown’s example, he compares language acquisition with learning to play the piano. That inspired me to go a bit deeper into the differences and similarities of the two. I’m going to explain why it makes sense to follow two completely different approaches to learn the two abilities, even though the exact same principles apply to both. My goal with this example is to explain why we should change our intuitions about how we learn foreign languages.

Disclaimer: I have never learned to play an instrument to any level of proficiency, so in this post there will probably be some errors and misconceptions about it. Please leave a comment if you find any, and I will happily correct them.

What’s a mental image?

In his book From the Outside In, Dr. Brown argues that the most important thing we need to have in order to perform any action is “a mental image of the result of the action”, and if that result is clear, simply practicing it will allow us to become very good at it. This is very similar to the reasoning and experience collected in the book The Inner Game of Tennis by Timothy Gallwey. Gallwey didn’t consider language acquisition, but Brown did. He concluded that the main difficulty in language learning is not the practicing, since the only reason why we need to practice speaking is to learn the physical movements required to use our mouths to produce sounds. The main difficulty for becoming fluent in a foreign language is building the mental image itself. After that mental image is clear, it takes a minimal amount of self-correction to make your speaking (pronunciation, rhythm, intonation and so on) closely match that of native speakers.

The two sides of every ability

Learning any ability has, therefore, two very distinct parts: mental image and physical skill. The mental image gives you a clear goal to aim for. Physical skill is what you need to develop to become better at reaching that goal using your muscles. Let me give a few examples. The mental image when you play the piano is your memory of the sound of the song you want to play. The physical skill side involves many hours of practicing hitting the keys, adjusting your movements to be able to move from one key to any other key cleanly, hearing the sound that is produced, and getting a sense for what every key is going to sound like before you even press it. For tennis, the mental image simply involves visualizing the place on the opponent’s court where you want the ball to go. The physical skill is what you develop by hitting the ball repeatedly and correcting your movements progressively until you get an intuition of where the ball will go after hitting it.

Both learning to play the piano and becoming fluent in a foreign language are composed of these two parts. The difference is in how much time we have to dedicate to each of the two parts. In the next two sections I’m going to explain mental image and physical skill in depth, and why there is so much difference between learning to play the piano and learning a foreign language.

Mental image

Becoming fluent in a foreign language has many similarities with learning to play a song on the piano. In both activities, you have to conform yourself to a “song”. When playing the piano, that means performing a song that you or somebody else composed. In language learning, that means conforming yourself to the conventions of that language, to the agreed upon meaning of words, word usage, pronunciation, rhythm, and intonation. Dr. Brown used the term “mental image” to refer to the representation of that “song” in our brain.

When trying to memorize a song, we are helped by the fact that music has many constraints. The pitches available are limited to the keys on the piano. Our piano has been tuned by a professional, and therefore we don’t need to worry about that C note being off tune. The possible durations for each note are also limited (whole, half, quarter, etc). Additionally, humans have an innate sense for music. We know intuitively what notes sound nice, and that helps us correct our mistakes when playing music. Also, because we have been hearing music since we were babies, a great deal of the mental image necessary to play a song is already there.

On the other hand, the “song” of language is really complicated. Human speech involves arbitrary intonations, rhythms, and an infinite continuum of possible mouth positions and coordinations. Language is a completely arbitrary convention. That means that when we first start learning a foreign language, we have no feeling at all for what sounds okay, and there’s no immediate, precise outside feedback that tells us how close we are to our goal when using it. There’s nobody tuning our Cs in this case.

Since musicians are very familiar with scales and chords, they have very strong intuitions for which successions of notes make sense, and things like memorizing a tune after hearing it a single time become a possibility. The equivalent of this, when talking about languages, happens when a person’s brain has had enough time to figure out the patterns of phonemes in the formation of words, the different intonations of words and sentences, and so on. In this case too, the learner has an intuition for what combinations of sounds a word can be made of, and it becomes possible for the learner to remember a word after hearing it a single time.

However, how do we get to the point of having such a strong mental image? It’s definitely not by practicing writing and speaking, and definitely not by getting corrections. As we will see in the next section, these things belong to the physical skill. The only way we can create a faithful mental image is by creating, in our mind, the neural connections between words, their meaning, their sound, and their usage. The only way to do that is by receiving lots of input in the language. That input needs to be understandable (so that our brains can connect the words to their meanings), it needs to be natural (so that our brains can connect the words to their correct usage), it needs to be plentiful (so that we give our brains the chance to figure out the patterns in the pronunciation and usage) and finally it needs to be engaging (so that we are paying attention in the first place).

Physical skill

Building the physical skill that allows you to play the piano takes a lot of time. It’s not easy to move your hands over a long distance at the pace of a fast tune and to hit the keys at the exact position at the precise moment, even if the mental image of the song in your head is very clear. The need to use both hands and several fingers at once makes it quite complicated. All this muscle memory and hand-ear coordination needs to be internalized really well so that the musician can move his or her hands around without thinking. This instrument-specific practice is a big part of learning how to play an instrument, and that means that even a world-class pianist still needs a lot of practice to become a good guitarist. This hand-ear coordination is by no means a natural one. That’s why building the physical skill that is going to allow you to play the piano requires a vast amount of practice time.

When speaking a language, the instrument we need to play is our mouth. Luckily, using it to produce sounds is a very natural thing. Additionally, when we try to start pronouncing words in a foreign language, we are not babies anymore, so we are already quite adept at using our mouth. We have practiced producing sounds with our mouths in many different ways for many years. We have a lot of practice pronouncing words in our native language, but we have also tried making funny voices, imitating people with different accents, and even using our voice as an actual musical instrument when singing.

We definitely need some practice to produce sounds we have never produced before, for example those belonging to a second language. However, even native-like pronunciation requires mouth movements that are only slightly different to the ones we use in our native language. Things like placing your tongue on your lips instead of on your teeth, articulating a vowel in a slightly different position, and so on, are very small variations from the pronunciation in your native language. That means that the amount of practice we need to pronounce words in a foreign language is actually very little.

Why is it, then, that many people who learn a language later in life fail to produce those (slightly different) pronunciations close to how natives do it? Many people living in a foreign country retain strong accents after decades of living in the country where the language is spoken, even if they speak it every day. We call that fossilization. The most notable example I can think of is university professors who spend several hours a day lecturing in the local language, but after years of lecturing, their students still have a hard time understanding them. The problem here is not slow progress. The problem is no progress at all, because people with that problem don’t know the difference between the way they speak and the way native speakers speak.

My best guess about why fossilization happens is that practice is useless unless we have a clear image in our mind of the goal we want to reach. That’s why I believe that practice is actually counter-productive until we have a clear mental image, and that anybody aiming to become fluent in a foreign language is going to obtain much better results in the long run by focusing on getting a lot of input at the beginning to improve that mental image, and starting to speak and write only when it starts coming naturally or when really necessary.

But …

As usual, I’d like to mention some common objections that come up when discussing this topic, and my take on them.

  • As adults, we can’t learn like children: Many people believe that as adults it’s impossible to form a mental image unconsciously the same way children do; therefore, the only way for an adult to learn a language is to be aware of the grammar and the meaning of words and practice using them until it becomes a habit. However, everyone who has learned a foreign language to fluency has had the experience of getting a feeling for the grammar, the nuances and the usage of words which were never explained to them. The experience of adults learning like children is therefore a common one, and I’m simply saying that it’s okay to do only that.
  • No matter how much I practice, I can’t roll my “r’s”: Being a native Spanish speaker, I often hear this from people from the States. Just after saying it, they proceed to roll an “r”. Their problem is not that they can’t roll an “r”; their problem is that they can’t do it like a native Spanish speaker. We’re back to the problem of not having a feeling for how far you are from your target and what you should aim for. If they had that mental image, they wouldn’t need to practice more than a few minutes for a couple of days to master it.
  • We need to practice the grammar: This is also a very widespread belief. However, as you have probably guessed if you read the rest of this post, grammar and sentence construction belong to the mental image, not to physical skill. In my experience, no practice at all is necessary to learn how to produce natural-sounding sentences, and we find many similar cases that back it up (case history of Richard Boydell).
  • We need practice to convert our passive vocab to active vocab: Cleopatra, gallows, horseshoe, shrine, blacksmith, Neptune, malaria. What do all these words have in common? They’re not used very frequently in daily conversation. Still, I don’t think many native speakers would have much of a problem producing most of these words fluently in a conversation. My experience when speaking my mother tongues (Spanish and Catalan), is that I routinely use words for the first time in my life without having had to practice them beforehand. My conclusion is that the difference between active and passive vocab is determined by the amount of exposure you’ve had to a certain word, not whether you have practiced saying it. If you have had enough exposure to the word, you will be able to produce it fluently.

What can we do?

If your aim is to become fluent in a foreign language, the best way to improve your mental image is to spend as much time as possible getting input in the language. The mental image you build of the language is going to be affected both by what you hear and by all the surrounding context, so try to get input in authentic contexts as much as you can. But not all input is created equal. Any minute you can spend listening to the language in a real-life situation is going to help you much more than reading, watching or listening to materials in that language. Also try to get input in many different contexts to help you get a well-rounded feeling for the different nuances and usage of words.

If you are a teacher and want to help your students build their mental images, try to provide them with input with as much context as possible. Role-plays are good. Even better, give your students a chance to experience the language in real-world situations. Don’t spend any time trying to get your students to practice writing or speaking or doing exercises. You should consider any time spent doing that as worse than doing nothing. Any physical skill developed without a clear mental image is going to get them further away from their goal, and any progress they make will need to be undone. As usual, it is very important that you give students a measure of their progress. Since regular exercises and exams are useless when trying to measure the mental image, they need an alternative way to measure their progress. At the beginning, the best way to do that is by giving them a sense that they understand more than they did before. You can center your input around one word that you’ll make sure they learn, and the next day you can casually drop it here and there, so your students realize they know more today than they did yesterday. Listening comprehension tests are also a possibility, but there’s no need for the questions to be in the foreign language.

Summing it up

In this post, I’ve tried to explain why our intuitions about learning in general can’t be applied to foreign languages and how the distinction between mental image and physical skill allows us to explain why different strategies are needed to learn different abilities.

In particular, it takes a long time to create a mental image when learning a language, but the necessary physical skill is relatively easy to learn when compared to learning to play an instrument. That means that we need to practice playing an instrument a lot, but much more listening is required to learn a foreign language. In my experience, when you have a clear mental image of a language, words come out of your mouth fluently even if you have never practiced saying them before.

As usual, I would like to conclude this post by asking for your opinion. What are your ideas on the topic? Do you have any personal experiences or teaching experiences that agree or disagree with these ideas? Please let me know.

6 thoughts on “How to play a foreign language”

  1. Hi Pablo, good post 🙂 I’ve got a few remarks:

    (1) While my personal preference these days is in line with the ideas outlined in your post, I am not convinced that this is the only way and that other approaches are fundamentally flawed. The reason for this skepticism is that I just can’t see the empirical data to back up such ‘radical’ claims. There are people who achieve native-like abilities in a foreign language who started out using traditional methods and then went on to fully immerse themselves in the target culture (often with spouses/partners from that culture). On the other hand, there are many students of ALG who never acquire native abilities, even though they may live in their target country. The ALG graduates who speak like a native, how many of them are there?

    (2) Kids do things adults rarely or never do. One thing that comes to mind is the intense social pressure to conform. Kids who mispronounce words are often mocked and shamed. This might be quite important for the acquisition of proper (in the sense of that particular group) pronunciation. Another thing kids do and adults don’t is to play with words, to make up new words, to repeat the same phrase over and over etc. Adults do this to some extent, but kids do it for hours, days and months on end.

    (3) There is a difference between what is required to comprehend and what is required to actually speak grammatically. It seems to me that we can often get away with only a partial mental image and still have full comprehension due to context and other clues, but in order to speak we need the full mental image. I suspect that the full mental image forms only through active use of the language, and that can then be considered a form of practice. I’m currently going through my second extended silent phase, and it’s just so apparent that only speaking will truly allow me to grasp the language fully and complete those mental images.

    (4) There are many kids who need some kind of (short-term) language support, for instance regarding pronunciation. I myself went through this when I was about 4 or 5 years old, and my issues got fixed back then. It’s not unusual, you will find speech therapists for children basically everywhere. Just the mental image alone might not do it, sometimes explicit instruction and practice seems to be required even for kids.

    What do you think? 🙂


    1. I’ve been thinking a lot about your remark number (3), Andrej. What you describe as a “partial mental image” sounds similar to the concept of “traces” that Dr. J. Marvin Brown writes about in “Learning Languages Like Children”: traces that are carried by memories and that the brain eventually accumulates to build language, or “the full mental image” as you write.

      “The brain can’t use sound traces to speak with, but it can use them to build language with,” Brown writes. “It’s the recognition of this fact that is the whole difference between ALG and other natural approaches.”

      Later on he writes: “The typical way that adults interfere with the [natural language-acquisition] process is to try to speak from a trace (before the full sound has been formed). But since the brain can’t use traces to speak with, the only way they can do this is to build the complete sound themselves (either from sounds in their native language or from their knowledge of phonetics).”

      (I’m assuming that while Brown talks of sound, the same idea also applies to other features of language like grammatical structure.)

      You wrote “it’s just so apparent that only speaking will truly allow me to grasp the language fully and complete those mental images.” I suspect that what’s happening is two things:

      1) You’re completing partial mental images using previously acquired language and perhaps conscious knowledge of language when speaking.

      2) Speaking with people is giving you further language input and experience, from hearing and understanding what they say to you, that is allowing you to complete partial mental images.

      It could be a combination of both these things but probably much more the latter for you as you are generally trying to follow an ALG-like approach. ALG of course would suggest the former is deleterious in terms of being able to reach native-like ability. The latter I think can fit with an ALG approach as it can be done without conscious practice, and I suspect children often get language input through interaction in this way that completes mental images of language for them.

      I don’t think it’s necessary though to speak in order to complete the mental images if we have enough of the right kind of input. I think elements like variety, repetition, observed interaction, and strong context can help one create many more complete mental images without speaking.

      From this perspective the problem with relying mainly on things like, for example, the AUA Thai classes for input is that especially at advanced levels a lot of the classes can consist of lectures where one is getting the gist of almost everything but not enough repetition and other factors to build many full mental images in a relatively short time. While there are usually two teachers, there is often not much dialogue between them, with one teacher talking most of the time.

      In contrast I found with watching TV dramas often the constant back-and-forth of dialogue was very helpful and seemed to create “full mental images” of language. A lot of things would “pop up” in my head later, some of which I could then easily produce. But unlike AUA of course, things like TV dramas are not designed to be understandable for learners of the language so comprehensibility is lower.

      I suspect that listening to and observing conversations that are made highly comprehensible could be at least as effective as speaking and interacting oneself in creating “full mental images”.


  2. Interesting as usual Andrej 😉
    Please try to destroy my other posts when you get some time 😛
    I don’t have answers for everything, but I can tell you what my current thoughts are for each of the issues you mention.


    I don’t see it as a black-and-white thing. Even if you started with formal education, I think it all depends on the degree to which you ended up making connections to your native language (if you tried to read early on and practiced pronouncing things) or coming up with your own personal grammar for the language (if you tried to say things before having an intuitive feeling for the grammar). Eventually, most people who reach a certain level are going to transition to doing mostly things that natives do, so I think if you make that transition early enough then you’re still probably going to end up with a pronunciation that’s not so bad and without making that many mistakes (both grammatical ones and mistaken word meanings). Of course I don’t have any kind of statistical proof for this. If you have any information on this topic I’d like to take a look at it.

    About the students using the ALG method, I have met almost nobody who has done ALG almost exclusively, so it’s really difficult to tell. Right now I’m trying to get in contact with some of the people who have done best at the AUA school. Eventually I’m planning on collecting some data from AUA students or alumni who have studied for more than a certain number of hours, getting information about the total amount of time they have spent on each activity, and maybe getting some recordings so that native speakers can score them on their understandability, native-likeness, fluency, etc.

    About living in the target country, I don’t know if that’s actually the best thing. Living in the country where the language is spoken may very often require you to speak the language before you are really ready, and that was actually one of my concerns when I first came here. This is assuming that the learner intends to stick to ALG. In reality I think most of the students will keep doing many of the things they would do in their own country, like conscious studying of the grammar, reading/writing, getting a private tutor and a language exchange partner, etc.


    I don’t see how this feedback could result in native pronunciation. Let’s for now ignore the fact that this mocking and shaming does not necessarily exist in all regions and cultures. The only kind of mistake I can think of adults and other children pointing out is when a child has an obvious problem to produce one specific sound, like using /f/ instead of /θ/ (saying “funder” instead of “thunder”), or /θ/ instead of /s/ (having a lisp). However, there are several reasons why this kind of feedback cannot account for native pronunciation:

    – This happens with a small number of sounds and doesn’t happen to most children, so a different process must be at work to allow most people to learn most of the sounds.
    – The feedback is not specific enough. I think I mentioned this in the article. You are only mocked for very big deviations from the standard. The points of articulation of vowels (the position of the tongue when saying them), for example, are not finite, but a continuum. There’s no way for that feedback to be specific enough to let the child know that the point of articulation of the vowel the child is trying to pronounce should be 3mm to the front. These corrections can’t be made by the child looking at how adults move their mouths either, since most things we do when speaking are not visible. The only way I can think such minute corrections can be made is if the child is able to correct based on the difference between what he’s hearing and what he’s saying.

    I think if this mocking and shaming were important to the acquisition of native pronunciation, nobody would end up with native pronunciation. Just knowing that you are doing something wrong doesn’t tell you how to do it right. And, if somebody drawing your attention to it is enough for you to correct it, that means that you can correct it by yourself just by noticing it.

    What you mention about kids repeating over and over is something that I’ve heard before, but I’m still not sure when that happens. In which situation do you hear children repeating the same thing over and over? I can’t remember either my little sister or any of my little cousins ever doing that. Adults, however, do that a lot when trying to learn a foreign language, asking a native speaker to repeat something several times and imitating that person. I constantly see students at AUA school repeating a word to themselves several times right after the teachers used that word. When thinking about children, the only thing that comes to mind is parents trying to get their children to repeat something by saying it many times to them, and the children simply not caring, or if children are old enough, just saying it once so their parents will stop bothering them.


    I agree that the mental image needs to be much clearer to speak than to understand language at the same level.

    Let’s suppose that you need to make active use of the language to get the full mental image. Why do you think that’s so? Do you think that all the necessary information is in your brain but that, by speaking the language, it will get reorganized in such a way as to allow you to speak fluently? Or do you think that by speaking you can get new information into your brain that wasn’t there before? Do you think it’s more about forming habits?

    Why is it apparent that speaking will allow you to complete the mental image?

    Honestly I’m not that confident about this part myself, and I still don’t have a complete theory of what the effects of habits are, whether they have any influence at all when putting sentences together, and whether habits can also affect the mental image, so I’m interested in knowing why you think it’s apparent.


    This is a minority of children, so as I mentioned earlier it can’t be the principle by which learning actually works, since the majority of people learn perfectly fine without it. Even children who do need speech therapy still correctly pronounce all the other sounds in their language besides the ones they have trouble with. I have my suspicions that this may be an issue with over-protective parents wanting to be part of every single aspect of the education of their children, even in things that are best left alone, and over-stressing their children.

    Again, thanks for the comment, and I’ll be waiting for your answer on this one.

