Gaana Mangwao

How to caption multi-script short-form content?

Currently, music discovery is largely driven by Reels, TikToks, YouTube Shorts, and other short-form content. While adding captions helps make music videos more engaging and accessible, there’s limited screen space when viewing them on a phone.

To see which caption formats work within that space, I showed Pakistani and Indian viewers music videos with different caption formats and tested some assumptions that musicians had.

Music clips with Urdu, Roman Urdu and English captions

Q1. Why should I add Urdu lyrics if my music is not traditional or classical?

Regardless of genre, simply being able to read song lyrics in Urdu script had emotional and aesthetic appeal for Pakistanis. As one person said:

“(It) feels like a homecoming”

Musicians and fans rarely see Urdu online, so seeing it in captions really changed how they experienced the music. It also helped them pronounce lyrics correctly.

Captions generally make it much easier to understand lyrics and help you remember songs better. Half the people I talked to remembered feeling embarrassed in the past because they’d misheard lyrics. Captions cleared up what the actual words were and made it easier to sing along.

Q2. Will I alienate my fans who can’t read Urdu?

Captions, especially in Urdu and Roman Urdu, boost the music video experience for everyone, even those who don’t read Urdu. Indians, for instance, said the Urdu lyrics simply “add to the vibe.”

In India, with its many languages, people are used to seeing both native and Romanized scripts. The native script helps them pinpoint a song's origin.

Adding Hindi captions with Roman Urdu/Hindi further engages Indian fans. However, stick to no more than two scripts at a time to avoid overwhelming viewers. Consider making different video versions, choosing from Roman Urdu, Urdu, Hindi, and English translations.

Q3. What if my fans are used to reading Roman Urdu?

Some participants mentioned they liked having both Urdu and Roman Urdu captions because they could easily switch between them.

Roman Urdu often served as a helpful backup, especially for understanding complex vocabulary.

When songs were slower, though, the Urdu script was usually preferred.

Q4. What should I do if my song contains English and Urdu?

Roman Urdu works best for bilingual songs. People in Pakistan just aren’t used to reading Urdu in its traditional script, and it’s even harder when English words are written that way.

Plus, trying to read both Urdu and English scripts in the same line is a mental workout, mainly because you read them in opposite directions: Urdu runs right to left and English left to right.
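If you keep your lyrics in a text file, a small script can flag the lines that mix reading directions so you know where Roman Urdu captions would be kinder to viewers. This is only a rough sketch: the function name `direction_mix` and the sample lyric lines are made up for illustration, and it relies on Python's standard `unicodedata.bidirectional` lookup rather than a full implementation of the Unicode bidirectional algorithm.

```python
import unicodedata


def direction_mix(line: str) -> str:
    """Classify a caption line as 'rtl', 'ltr', 'mixed', or 'neutral'."""
    has_rtl = False
    has_ltr = False
    for ch in line:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # right-to-left letters (Urdu letters are "AL")
            has_rtl = True
        elif bidi == "L":         # left-to-right letters (Latin, Devanagari, ...)
            has_ltr = True
    if has_rtl and has_ltr:
        return "mixed"
    if has_rtl:
        return "rtl"
    if has_ltr:
        return "ltr"
    return "neutral"              # only digits, spaces, or punctuation


# Hypothetical lyric lines, not taken from any real song.
lyrics = ["تم میرے ساتھ ہو", "tum mere saath ho", "تم میرے baby ہو"]
for line in lyrics:
    print(direction_mix(line), "|", line)
```

Lines reported as "mixed" are the ones where the eye has to change direction mid-line, so they are good candidates for a Roman Urdu caption instead.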

Q5. How do I get the pretty Urdu script?

The pretty script is called Nastaliq.

Example of Nastaliq and Naskh

If you want to type in Nastaliq or use Nastaliq fonts in your music videos, and they’re not already on your device, you can download them from Google Fonts or Urdu Fonts.

If you have an Apple device, you can even see Nastaliq in your Notes app if you follow the instructions here.

Q6. How can I make my subtitles accessible and beautiful?

Subtitling is also an art form, and how we add those subtitles can either enhance or detract from the aesthetic vibe of a music video.

  1. If you’re using Urdu script, make the Urdu font a bit larger. It generally needs more space than English fonts for readability.
  2. For English or Roman Urdu, sans-serif fonts like Open Sans, Roboto, Inter, and Lato are easier to read.
  3. Crop your videos to a 9:16 ratio so they fill the entire mobile screen. This is what people are used to seeing on platforms like TikTok and Reels.
  4. Don’t just use a simple white font color. When the video background gets light, white text can be hard to read. Instead, add a drop shadow, glow or thin text outline to make the lyrics stand out against the video.
  5. Avoid putting a plain black background behind your lyrics for contrast. Even though it makes the words more legible, music fans found it didn’t look good.
  6. Stick to a maximum of two lines of text at a time, especially on mobile screens.
  7. Put captions at the bottom center. This is the standard position, and it keeps them from covering other important visual elements.
  8. Finally, if your video has fast-moving scenes, don’t make viewers read lyrics on screen. They might get motion sickness.
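
Two of the tips above, the 9:16 crop and the two-line limit, come down to simple arithmetic, so here is a small sketch of how you might work them out before opening your video editor. Everything named here is an assumption for illustration: the helper names `center_crop_9_16` and `split_caption`, the default of 32 characters per line, and the sample lyric. Character counts are also only a rough stand-in for on-screen width, especially for Urdu script, so treat the line limit as a starting point.

```python
import textwrap


def center_crop_9_16(width: int, height: int) -> tuple:
    """Return (crop_w, crop_h, x, y) for the largest centered 9:16 window."""
    if width * 16 > height * 9:
        # Source is wider than 9:16: keep the full height and trim the sides.
        crop_h = height
        crop_w = height * 9 // 16
    else:
        # Source is 9:16 or taller: keep the full width, trim top and bottom.
        crop_w = width
        crop_h = width * 16 // 9
    # Round down to even numbers, which many video encoders expect.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    return crop_w, crop_h, (width - crop_w) // 2, (height - crop_h) // 2


def split_caption(text: str, max_chars: int = 32, max_lines: int = 2) -> list:
    """Wrap a lyric and group it into caption cards of at most max_lines lines."""
    lines = textwrap.wrap(text, width=max_chars)
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


# A landscape 1920x1080 video cropped for a phone screen.
print(center_crop_9_16(1920, 1080))  # (606, 1080, 657, 0)

# A hypothetical long lyric split into cards of two lines or fewer.
for card in split_caption("tum mere saath ho to har raasta aasaan lagta hai aur har din naya"):
    print(card)
```

The crop values are the width, height, and top-left offset of the window to keep, which is the form most editing tools ask for.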