How the “secret language” of filmmaking works

Image © Adam Westbrook

Thanks to YouTube, Netflix, and even Instagram, more of us are learning to convey our experiences visually. But what does it really mean to be a visual storyteller?

In our frenetic and visual world, where we are constantly assaulted by a high-definition barrage of TV, Netflix, YouTube and cinema, it is hard to imagine how the first moving images, flickering raggedly at 12 frames per second on the wall of an 1888 workshop in Leeds, England, must have appeared to their inventor, Louis Augustin Le Prince.

“Roundhay Garden Scene” — the earliest known moving image scene, by Louis Augustin Le Prince in 1888.

We know that moving images, despite their crude early form, had a magical quality for those who saw them. But as with all new media, it took people a long time to figure out what to do with them.

It wasn’t until 1903 that an American cameraman-turned-director, Edwin S. Porter, realised that you could tell a story by cutting together different, separately filmed shots. The result was The Great Train Robbery (1903), an early landmark in narrative cinema.

And 20 years after that, filmmakers and audiences alike were still grappling with this mysterious “seventh art”. What were its secrets? An unexpected voice in this debate was the writer Virginia Woolf who, in 1926, wondered in an essay about the fundamental elements of the moving image:

“If a shadow at a certain moment can suggest so much more than the actual gestures and words of men and women in a state of fear, it seems plain that the cinema has within its grasp innumerable symbols for emotions that have so far failed to find expression…Is there, we ask, some secret language which we feel and see, but never speak, and, if so, could this be made visible to the eye? Is there any characteristic which thought possesses that can be rendered visible without the help of words?”

And maybe there, captured in a paragraph, is what we visual storytellers are ultimately trying to find expression for.

Heaven’s infinite mercies

Picture the scene: it’s a bleak autumn afternoon in a lonely hilltop cemetery. Brown leaves are tossed between the crooked gravestones by a biting wind, as we make our way over to the only new plot. Beside the gravestone stands a veiled, weeping woman, dressed in black.

This was the set-up for a well-known short story by Ambrose Bierce. Here, encoded in words, we have two vivid images: a newly dug grave in a lonely cemetery, and a woman weeping beside it. If you were to describe the woman in a single word, what would you say?

Like many of us, in your mind’s eye, you’re probably picturing a widow.

Bierce’s story continues with the approach of a stranger, walking through the cemetery. He sees the woman’s distress and tries to comfort her. “Console yourself, madam,” he says. “Heaven’s mercies are infinite. There is another man somewhere, besides your husband, with whom you can still be happy.”

“There was,” she replies, “but this is his grave.”

The unexpected twist in The Inconsolable Widow plays upon our brains’ way of decoding images.

And within it lies the secret heart of visual storytelling.

The third something

It wasn’t the image of the gravestone alone that made you think “widow”. Nor was it purely the woman in black. It was the combination of these two images that created the idea.

And it is in this juxtaposition of images that the energy, the meaning, of an idea is contained. If the audience can decode the images correctly then, like nuclear fission, the energy behind the idea is released in a sudden, illuminating glow.

In his 1938 essay Montage, the pioneering Soviet filmmaker Sergei Eisenstein put this even more simply:

“When we see two objects placed side by side we draw certain conclusions almost automatically.” He continued:

“This property reveals that any two pieces of film stuck together inevitably combine to create a new concept, a new quality born of that juxtaposition.”

He called this “the third something”.

There was nothing in my description of the windy hilltop cemetery to tell you that the woman in black was a widow. Your brain made that connection all by itself, because the images I used, when combined, inevitably encode that idea.

It turns out that our brains can’t help it. They are connection machines. Show me a picture of a boiled egg, followed immediately by an image of the Vatican, and I’ll start searching for connections, even if there are none.

Inside this desire to connect the unconnected is where good visual storytellers hide their ideas.

It is very basic, but this concept is as important to the visual storyteller as chord structure is to the musician. And it applies not just to filmmakers and video producers, but to anyone telling sequential stories with images. That includes audio slideshow producers, photojournalists, even comic-book artists.

Speaking of comic books: in Understanding Comics, his much-admired book on the art, Scott McCloud sees the same force in action.

“When taken individually, pictures are merely that, pictures…however when part of a sequence, even a sequence of only two, the art of the image is transformed into something more…Animation is sequential in time but not spatially juxtaposed…each successive frame of a movie is projected onto exactly the same space — the screen — while each frame of comics must occupy a different space. Space does for comics what time does for film.”

Closing the gap

“You always want to tell the story in cuts. Which is to say, through a juxtaposition of images that are basically uninflected.”

That’s how screenwriter and director David Mamet explains it in his book On Directing Film.

According to Mamet, this image juxtaposition is what documentaries have always done well.

“Documentaries take basically unrelated footage and juxtapose it in order to give the viewer the idea the filmmaker wants to convey. They take footage of birds snapping a twig. They take footage of a fawn raising his head. The two shots have nothing to do with each other. They were shot days or years, and miles, apart. And the filmmaker juxtaposes the images to give the viewer the idea of great alertness…They are not a record of how the deer reacted to the bird. They’re basically uninflected images. But they give the viewer the idea of alertness to danger when they are juxtaposed. That’s good filmmaking.”

Juxtaposing images — assembling them side by side in space or time — is so powerful because it creates a gap, which our brains are forced to close.

There are all sorts of biological reasons for this, which can be neatly summed up in our mind’s desire for closure, answers, certainty and patterns. “Here in the limbo of the gutter,” writes McCloud, “human imagination takes two separate images and transforms them into a single idea.”

And this makes your audience an accomplice, not just a passive viewer. You haven’t told them what has happened, or how they should feel: they figured it out for themselves.

An excerpt from Understanding Comics © Scott McCloud (1993)

Becoming a visual storyteller

So where do we start? Well, according to Eisenstein, we begin with an idea, which we then break down into visual fragments.

“The author sees with his mind’s eye some image, an emotional embodiment of his theme. His task is to reduce that image to two or three partial representations whose combination or juxtaposition shall evoke in the consciousness and feelings of the spectator the same generalised initial image which haunted the author’s imagination.”

So is filmmaking all about editing? Well, not necessarily, says Daniel Mercadante of award-winning collaborative Everynone, who have produced some strikingly visual short films for RadioLab among others.

“Stories are told by expressing more than one image in sequence,” he tells me. “This needn’t be an edit. One static shot, of the same scene, is inherently a sequence of several frames, there is a cinematic story in that transpiration of images. Some of my favourite films are single shot pieces of very little happening on screen; but, time is involved, each frame leads to the next.”

Visual flip-flop

Other digital storytellers are taking these fundamental principles of the moving image and using them in exciting ways, to powerful effect. In his unique and award-winning film Green, French filmmaker Patrick Rouxel tells the story of a dying orang-utan named Green and explores the complex issue of deforestation, all without a single line of dialogue or voice-over.

He told me that the secret is appealing to viewers’ emotions and not their intellect, while using the contrast and combination of images to drive the narrative forward.

“The first thing I wanted to establish is that the story is told from Green’s point of view, and I did so by inserting a few shots of the room’s ceiling and walls seen from her physical POV,” he says. “From the very beginning of the film, we see what she sees and we thus know it’s her story.

“Green is there to give a continuity to the story and a soul to the destruction of the rain forest. As the film unfolds, every time we leave Green, it is to witness more destruction, and every time we come back to her she looks weaker and weaker. Through this flip-flopping, my hope is that the viewer will feel the suffering behind the destruction rather than just see destruction.”

And that, perhaps, is the core of what visual storytelling can do. By combining and juxtaposing images, whether in space or time, we open up intriguing gaps which our viewers’ minds are compelled to close.

And in connecting these images in their own minds, our audiences tell themselves a story, a story which happens in their hearts, not just their heads.

This article first appeared in Issue #2 of Inside the Story Magazine.

Video artist working at The New York Times. I write a weekly newsletter about visual storytelling and creativity.
