Transcription in Frame.io generates an automated transcript from the spoken dialogue in your video and audio files. Use it to navigate during playback, find key moments in seconds, export transcript files, and more, so that reviewing and approving dialogue, podcasts, and audiobooks in Frame.io is more efficient.
Using Transcription
To create a transcript for an audio or video file, click the Transcription icon in the bottom-right of the playbar. A new panel will appear on the left.
Select the one of the 27 supported languages (see the full list of languages below) that matches your audio/video file, then click Transcribe to begin generating. You can also select the Auto option to auto-detect which supported language is spoken. Transcription usually takes only a few minutes to complete.
Note: The Transcription feature does not offer translation services, so we recommend selecting only the language that matches your audio/video file.
When the transcript becomes available, you will see the text broken down by speaker with Speaker ID. During playback, the text in the transcript is highlighted in sync with the video/audio, and the transcript auto-scrolls to help you follow along.
Note: You will receive notifications when transcriptions are finished generating, but only for transcripts you generate. Notifications can be managed in your Account Settings.
You can navigate through your audio/video file using the transcript by clicking on any word, which brings you to that exact point in the playback timeline. This means that if you want to leave a comment on the exact timecode of a word in the transcript, you can click that word and then create a comment, and the comment will be linked to that timecode. Additionally, highlighting a phrase in the transcript lets you create a range-based comment by clicking on the “+” icon that appears.
Use the search bar above the transcript to instantly find the word or phrase you’re looking for anywhere in the transcription. All matches will be highlighted, and you can use the up and down arrows as shortcuts to move through multiple results.
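To make the word-to-timecode behavior concrete, here is a minimal Python sketch. It assumes a transcript stored as a list of words with start and end times in seconds. The `Word` type, the sample timings, and both helper functions are hypothetical illustrations and do not represent Frame.io's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the media
    end: float


# Hypothetical sample transcript; the timings are made up for illustration.
TRANSCRIPT = [
    Word("Welcome", 0.0, 0.4),
    Word("to", 0.4, 0.5),
    Word("the", 0.5, 0.6),
    Word("show", 0.6, 1.1),
]


def seek_time(words, index):
    """Playhead position after clicking the word at `index`."""
    return words[index].start


def comment_range(words, first, last):
    """Start and end times of a range-based comment for a highlighted phrase."""
    if first > last:
        raise ValueError("first must not come after last")
    return words[first].start, words[last].end


print(seek_time(TRANSCRIPT, 3))         # 0.6
print(comment_range(TRANSCRIPT, 1, 3))  # (0.4, 1.1)
```

In this sketch, a single click maps to one timecode (the start of the word), while a highlighted phrase maps to a range that runs from the start of its first word to the end of its last word.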
Using Captions
Once transcription is enabled, you will have the ability to toggle closed captions as well. Click on the “CC” icon next to the Transcription icon to enable this feature. Both captions and transcript will be in sync with your audio/video file during playback. Closed captions will also be available on iOS.
Caption Ingest
Transcriptions and captions are auto-generated by Frame.io when using this feature. If you would rather use your own captions from SRT or VTT files, you can upload those directly to the video asset instead. From the CC icon, select Add Captions and click on Upload Captions. You can also find this option in the three-dot menu in the Transcriptions panel or in the asset's three-dot menu on the Project page.
Choose the SRT or VTT file to load into the uploader and choose the correct language that is associated with the captions. Once everything is queued up and ready to go, click Add to Asset to begin uploading the caption file.
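Before uploading your own captions, it can help to sanity-check the file locally. The following Python sketch checks the basic SRT layout (an index line, a timing line such as `00:00:01,000 --> 00:00:04,000`, then the caption text, with cues separated by blank lines). It is a simplified check written for this article, not a Frame.io tool: the `check_srt` function and the sample text are assumptions, it does not handle optional SRT position data on the timing line, and it does not cover VTT files, which use a `WEBVTT` header and a period instead of a comma in timestamps.

```python
import re

# SRT timing line, e.g. "00:00:01,000 --> 00:00:04,200"
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(h, m, s, ms):
    """Convert the four timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def check_srt(text):
    """Return a list of problems found in SRT text (empty if none)."""
    problems = []
    previous_start = -1.0
    # Cues are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for number, block in enumerate(blocks, start=1):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            problems.append(f"cue {number}: expected index, timing, and text")
            continue
        match = TIMING.fullmatch(lines[1].strip())
        if match is None:
            problems.append(f"cue {number}: malformed timing line")
            continue
        parts = match.groups()
        start = to_seconds(*parts[:4])
        end = to_seconds(*parts[4:])
        if end <= start:
            problems.append(f"cue {number}: end is not after start")
        if start < previous_start:
            problems.append(f"cue {number}: starts before the previous cue")
        previous_start = start
    return problems


# Hypothetical sample; the second cue deliberately ends before it starts.
SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Welcome to the show.

2
00:00:04,500 --> 00:00:03,000
Today we talk about captions.
"""

print(check_srt(SAMPLE))  # ['cue 2: end is not after start']
```

An empty list means the file passed these basic checks. It does not guarantee that the captions line up with the audio.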
When the file is finished uploading, you can toggle the synced captions on and off on your video asset. You can also choose to view the Transcript panel in either the transcript or the captions format.
Note: It's possible to see multiple captions listed under the same language in the Captions menu. To tell them apart, highlight the language listing to see the source of the caption: either a Frame-generated caption or a caption ingest.
Editing Transcriptions and Captions
If any of the transcriptions are inaccurate or need to be adjusted, you can easily edit them and have those edits synced to the associated captions.
Go to the three-dot menu and select Edit for either Transcriptions or Captions to enter Edit mode. You can also highlight a group of text and click on the Pencil icon, or double-click on any text, to enter Edit mode. Once your edits are made, click Save, or click Cancel to discard your latest edits. Finally, refresh the page to see the edits reflected in the captions of the video and have everything properly synced up again.
Label Speakers
Transcriptions can identify the individual speakers in your media and separate their voices, labeling them Speaker 1, Speaker 2, and so on. This lets you follow the transcript more naturally and conversationally, especially when multiple people are talking in your media.
This is an optional feature and can be either checked or unchecked just before the transcription process begins.
Because it relies on Speaker ID, this feature also asks for explicit consent each time you use it. If you wish to skip this prompt and give consent for all future transcripts, check the box and click I Agree the next time you see this disclaimer.
Note: Due to stricter laws, US users in Texas & Illinois are not able to use the Label Speaker feature.
Roles and Permissions
All roles can view transcripts and captions inside of a Project (once generated), but only some roles can generate, edit, delete, or export them.
In Project/Workspace Roles
| Action | Owners/ | Full Access | Edit & Share | Edit | Comment Only | View Only |
| Generate Transcript / Caption | Yes | Yes | Yes | Yes | No | No |
| Delete Transcript / Caption | Yes | Yes | Yes | Yes | No | No |
| View Transcript / Caption | Yes | Yes | Yes | Yes | Yes | Yes |
| Export Transcript / Caption | Yes | Yes | Yes | No | No | No |
| Search Transcript / Caption | Yes | Yes | Yes | Yes | Yes | Yes |
| Edit Transcript / Caption | Yes | Yes | Yes | Yes | No | No |
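If you need to reason about these permissions in a script (for example, when documenting who on a team can export), the table above can be expressed as a simple lookup. This is only an illustrative Python sketch: the `ROLES` list, the `PERMISSIONS` mapping, and the `can` helper are hypothetical names, not part of any Frame.io API. The values are copied from the table, and the first role label is kept exactly as the table shows it.

```python
# Role columns in the same order as the table above.
# "Owners/" is kept exactly as the table labels it.
ROLES = ["Owners/", "Full Access", "Edit & Share", "Edit", "Comment Only", "View Only"]

# One row per action; each list lines up with ROLES.
PERMISSIONS = {
    "generate": [True, True, True, True, False, False],
    "delete":   [True, True, True, True, False, False],
    "view":     [True, True, True, True, True, True],
    "export":   [True, True, True, False, False, False],
    "search":   [True, True, True, True, True, True],
    "edit":     [True, True, True, True, False, False],
}


def can(role, action):
    """Return True if `role` is allowed to perform `action` per the table."""
    return PERMISSIONS[action][ROLES.index(role)]


print(can("Edit", "generate"))     # True
print(can("Edit", "export"))       # False
print(can("View Only", "search"))  # True
```

Note that the Edit role can generate, edit, and delete transcripts and captions but cannot export them, which is the one place where the Edit and Edit & Share columns differ.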
In Shares, users can only view and interact with transcripts and captions that have already been generated in the Workspace/Project where the Share originated. Transcripts and captions cannot be generated from within a Share.
In Shares
| Action | Authenticated | Identified | Unidentified |
| Generate Transcript/Caption | No | No | No |
| View Transcript/Caption | Only if already generated in Project | Only if already generated in Project | Only if already generated in Project |
| Export Transcript/Caption | No | No | No |
| Search Transcript/Caption | Only if already generated in Project | Only if already generated in Project | Only if already generated in Project |
Settings
Access Transcription options in the three-dot menu next to the Transcript search bar. Toggle auto-scrolling on or off. Export the Transcript in common formats (SRT, VTT, and TXT). Add another Transcript language to generate a new Transcript for your audio/video file. Remove the Transcript if you want to start over from the beginning.
Sharing
In the Share Builder, when creating a new Share link or editing an existing one, you can configure permissions to show or hide available transcripts and automated captions. The transcript and captions must first be generated before the option becomes available. Toggle the option on or off at any time to enable or disable this feature for Reviewers.
FAQ
Q: What are the languages supported for Frame.io Transcription?
A: There are 27 supported languages: Arabic, Cantonese / Traditional Chinese, Czech, Danish, Dutch, English (US), English (UK), French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Mandarin / Simplified Chinese, Mandarin / Traditional Chinese, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Ukrainian, and Vietnamese.
Q: Is there an easy way to generate multiple transcriptions at once?
A: Yes. From the project page, highlight as many assets as you’d like, then click on the transcription icon on the bottom-right to perform a bulk transcription task. All the selected assets will begin generating transcriptions in a single action.
Q: Are transcriptions handled internally within Frame.io, or externally through third-party tools? And how is data shared during this process?
A: Frame.io Transcription leverages a speech-to-text API to generate the transcript (same as Premiere Pro). No transcript data is collected or used for any training during generation of the transcript with the API. We only process the audio data and store the resulting transcript file within Frame.io so that it can be shown in the interface.
Q: My company/business does not permit the use of AI functionality found in Transcription. Can I disable this feature entirely?
A: Yes. If you'd like to disable Transcription for your account, please have the Account Owner reach out to Support for assistance. This will turn off the AI-powered functionality for the entire account.
















