Sometimes, a great idea takes an unexpected turn that makes it even better. When VoiceThread’s founders first started dabbling in software, they aimed to answer a key question: If you took an image and recorded the audio stories behind it, would this digital artifact have more value?
It turns out that the answer was a resounding yes, but perhaps not in the way that they imagined. Their prototype, a product aimed at helping families stay connected, caught the eye of educators, who immediately saw greater potential.
VoiceThread is a cloud-based media tool that allows users to collaborate on and communicate around media presentations. These VoiceThreads, as they are called, include images, video, and documents, and they provide the opportunity for teams of people to annotate using drawing tools, voice comments, written comments or video responses. They can be used to provide feedback, co-author, narrate and even host asynchronous meetings.
In a recent conversation, Steve Muth, Co-Founder and President of VoiceThread, explained more about his vision. He also provided some details on the company’s journey and how new technologies play a role in their success.
EdSurge: VoiceThread was originally a consumer-facing product. Why did you make the transformation to an education and business tool?
Muth: The moment we launched, we got flooded with all of this feedback, all coming from teachers. In fact, there was this cadre of kindergarten teachers who had this one use case which we just never imagined—working with pre-writers on a typical kindergartener activity.
Kids draw something, and then the teacher acts as a scribe and writes out what it is. And so you get a sentence of text underneath a picture. Then it gets stuck up on the wall in the classroom, and at parent-teacher conferences, parents will come through and see it. And maybe at the end of the year, kids will get it in a little folder that they get to take home with them.
What the kindergarten teachers immediately recognized was, “Whoa, I can get out of the middle. They don’t need a scribe if I can record what they’re saying.” Because those pre-writers—as anyone who knows little kids will tell you—have no problem storytelling at all. They have plenty to say! But the teacher has 20 students to go through. There’s a limited amount of time, and so she gets one sentence out of the whole thing, and it’s not in the kids’ own voice.
These teachers took their phones out, took some pictures of the students’ work, pressed record, and then said, “Tell me what’s going on here.” It was just like our original thought: a digital artifact was worth so much more after the voice had been added to it.
EdSurge: How does VoiceThread support social-emotional learning in a classroom environment?
Muth: We worked with a professor teaching a history of photography course fully online. It included a number of students who were lifelong learners (people who were coming back to college), and some were not very good at technology. They had a hard time to begin with, and she understood that.
This one student was given an assignment to evaluate a piece of art, and they did it, but they did it wrong. They misread the assignment instructions and used the wrong criteria to deliver it, but you can hear in the student’s voice that they are earnest, thoughtful, engaged. They just misunderstood the instructions.
Because of VoiceThread’s setup, the teacher, using only her voice, was able to give a correction that was so clearly supportive and generous. In comparison to the “red checkmark” experience, students get this feeling that, “Okay, I did that wrong, but here’s a teacher who cares about me, cares about me succeeding.”
For me, that’s it in a nutshell. If you combine a great teacher with this high-signal, rich interactivity, that’s what’s possible. Asking the teacher to do that in text is dicey. It’s a much more difficult job to convey a message. You’re reverting to emoticons, and then you have to factor in any reading comprehension problems the student may have. You could do everything right and still be misunderstood. Voice is very different. It’s one of those very core things that we experience from birth.
EdSurge: New technologies have played a large role in your development. Can you share an example of that?
Muth: We built VoiceThread from the beginning on Amazon Web Services and continue to use many of those services. The moment we saw Amazon Transcribe announced, we were like, “Oh boy, we’re going to use this!” It was so obvious that this was going to crack a nut that had been bothering us for a long time.
We had captions before that, but you had to build them by hand. Nobody wanted to do it because it was so much work. So we knew immediately we had to do this and that we could do it in a way that no one else had done before.
There are no settings on our automatic captioning. It’s fully automatic. At first, when we were demoing it to administrators, we would have to explain that over and over because they would say, “How are you managing all the requests for jobs?” We told them, “There are no jobs to be managed. We’re just captioning everything.”
There is not even a switch in their interface. Once they get that service from us, everything is just captioned. So, it really is a transformative thing. It’s an example of the appropriate use of big tech and its impact on the learning space. It’s not replacing teachers. It’s not pretending to replace them. Instead, it’s doing something that a teacher shouldn’t have to do anyway, which is to create captions. That’s not a good use of a teacher’s time.
We love that, and we’re looking forward to other stuff like Amazon Polly [text-to-speech], where we’re going to make it so that it’s like the reverse. All of the text comments should be able to be heard as well. We want you to be able to play a VoiceThread from the beginning of the conversation to the end and pick your medium. We feel like that needs to be the model—coming at the problem from both sides.