Create an OTT Streaming App for Mixed Content Formats
Horizontal and vertical video are not the same product. Building a platform that handles both is an architectural decision, not just a UI preference.
YouTube serves horizontal documentaries and vertical Shorts on the same platform. TikTok started vertical-only and is expanding to horizontal. Instagram handles both in the same app. Mixed content formats in OTT streaming apps are not a trend. They are the new baseline expectation for any mobile-first streaming platform.
For OTT founders, this is an architecture challenge that most platform teams underestimate. Horizontal and vertical video are not just different aspect ratios. They carry different UX conventions, monetization logic, and technical requirements across the player, storage, and delivery layers. This OTTclouds article maps what your platform needs to support both formats without forcing users to navigate between them.
Horizontal vs. Vertical: More Than Aspect Ratio
The 16:9 horizontal format was built around the television and desktop experience: lean-back viewing, with both hands free, that is scheduled or browsed. The 9:16 vertical format grew out of mobile-native behavior: one-handed, thumb-first, infinite scroll, with content discovered rather than searched for.
These different origins produce different user expectations at every layer of the experience.
| Dimension | Horizontal (16:9) | Vertical (9:16) |
| --- | --- | --- |
| Viewing posture | Landscape, both hands, lean back | Portrait, one hand, thumb-driven |
| Content length | Long-form: 10 min to 2+ hours | Short-form: 15 sec to 5 min |
| Discovery | Browse, search, scheduled viewing | Infinite scroll, algorithm-surfaced |
| Monetization | Mid-roll ads, subscription, PPV | Pre-roll, coin unlock, brand integration |
| Player UI | Persistent controls, progress bar | Minimal chrome, tap-to-pause |
| Sound | Sound-on expected | Sound-off default — captions essential |

The Five Architecture Layers That Must Handle Both Formats
A mixed-format OTT platform is not built by adding a vertical video section to an existing horizontal app. Every core layer of the platform must be designed to handle both formats with equal fidelity. These are the five layers where format-aware decisions are required.
1. The Video Player
The player is the most visible failure point. A player built for horizontal video that renders a vertical clip inside a landscape frame, pillarboxed with black bars on both sides, signals immediately that vertical content is a second-class experience on your platform. Users who encounter this leave. They have dozens of apps that handle vertical video natively.
What is required: A format-aware player that detects the content’s native aspect ratio and adapts its rendering mode automatically. When the player detects 9:16, it switches to full-portrait rendering with portrait-optimized controls (tap-to-pause, swipe-to-next, minimal chrome). When it detects 16:9, it switches to landscape rendering with standard OTT controls (progress bar, quality selector, episode list). The transition must be seamless when a user moves between format types within a session.
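The detection step can be sketched in a few lines of Python. This is a minimal illustration, not a player SDK: the names `RenderMode`, `PlayerConfig`, and `select_player_config` are hypothetical, and treating any video that is taller than it is wide as vertical is an assumption rather than a rule taken from a specific player.

```python
from dataclasses import dataclass
from enum import Enum


class RenderMode(Enum):
    PORTRAIT = "portrait"
    LANDSCAPE = "landscape"


@dataclass(frozen=True)
class PlayerConfig:
    mode: RenderMode
    controls: tuple[str, ...]


# Control sets follow the description above; the identifiers are illustrative.
PORTRAIT_CONTROLS = ("tap_to_pause", "swipe_to_next")
LANDSCAPE_CONTROLS = ("progress_bar", "quality_selector", "episode_list")


def select_player_config(width: int, height: int) -> PlayerConfig:
    """Pick a rendering mode from the video's native pixel dimensions."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Assumption: taller-than-wide means vertical; square video falls back
    # to the landscape player.
    if height > width:
        return PlayerConfig(RenderMode.PORTRAIT, PORTRAIT_CONTROLS)
    return PlayerConfig(RenderMode.LANDSCAPE, LANDSCAPE_CONTROLS)
```

Calling `select_player_config(1080, 1920)` yields the portrait configuration, and `select_player_config(1920, 1080)` yields the landscape one. Keeping the decision in one function is what lets a single player switch modes mid-session instead of loading a second player.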

2. Content Ingestion and Storage
Horizontal and vertical content have different transcoding requirements. A horizontal video optimized for 4K television delivery is processed differently from a vertical clip optimized for mobile data constraints. Storing both under a single undifferentiated content management system creates operational problems: wrong transcode profiles applied, incorrect thumbnails generated, metadata structured for one format but assigned to another.
What is required: Format-tagged content pipelines. At ingest, every piece of content is tagged with its native format. That tag propagates through transcoding (separate transcode presets for 9:16 and 16:9), thumbnail generation (portrait crops for vertical, landscape crops for horizontal), and metadata (episode structure for long-form, clip tags for short-form). The CMS must expose format as a first-class attribute, not an afterthought field.
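As a rough sketch of how a format tag could propagate through ingest, the Python below assigns the tag once and derives every downstream setting from it. The preset names, schema names, and function names are placeholders invented for illustration and do not refer to any particular transcoder or CMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestPlan:
    content_format: str    # "vertical" or "horizontal"
    transcode_preset: str  # hypothetical preset name
    thumbnail_aspect: str  # crop ratio for generated thumbnails
    metadata_schema: str   # which field set the CMS should apply


# One plan per format. Preset and schema names are placeholders.
INGEST_PLANS = {
    "vertical": IngestPlan("vertical", "preset_9x16", "9:16", "clip"),
    "horizontal": IngestPlan("horizontal", "preset_16x9", "16:9", "episode"),
}


def tag_format(width: int, height: int) -> str:
    """Tag content by its native orientation at ingest time."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Assumption: taller-than-wide is vertical, everything else horizontal.
    return "vertical" if height > width else "horizontal"


def plan_ingest(width: int, height: int) -> IngestPlan:
    """Return the transcode, thumbnail, and metadata settings for a file."""
    return INGEST_PLANS[tag_format(width, height)]
```

Because the transcode preset, thumbnail crop, and metadata schema are all looked up from the same tag, a vertical clip cannot end up with a landscape thumbnail or an episode schema by accident.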
3. Discovery and Feed Architecture
Vertical discovery in OTT is the layer most platform teams get wrong. They build a single discovery feed and attempt to surface both formats within it. The result is a visually inconsistent feed where horizontal thumbnails and vertical thumbnails compete for space in a grid that was designed for one of them.
YouTube solved this by keeping formats in separate sections of the app: the main feed for horizontal, the Shorts shelf for vertical, with a dedicated Shorts tab. Instagram keeps them on separate surfaces — Feed and Reels. The pattern is consistent: mixed content formats are best served through dedicated discovery surfaces, not through a single merged feed.
What is required: Separate discovery surfaces per format type, unified under a single navigation architecture. That means a horizontal content browser (grid or row-based, landscape thumbnails, title and metadata visible) alongside a vertical content feed (full-screen scroll, portrait thumbnails or direct autoplay, minimal text overlay), with users able to move between the two fluidly. The algorithm behind each surface is format-specific: session behavior on vertical content is a different signal from session behavior on horizontal content, and the two should not be mixed in the same recommendation model.
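One way to keep the two signal streams apart is to scope every watch event by format before any recommender sees it. The sketch below is a simplified in-memory illustration in Python. `FormatScopedSignals` and its methods are hypothetical names, and a production system would use a persistent event store rather than a dictionary.

```python
from collections import defaultdict


class FormatScopedSignals:
    """Keeps watch events for each (user, format) pair in separate buckets."""

    def __init__(self) -> None:
        self._events: dict[tuple[str, str], list[tuple[str, float]]] = defaultdict(list)

    def record(self, user_id: str, content_format: str,
               content_id: str, watched_fraction: float) -> None:
        if content_format not in ("vertical", "horizontal"):
            raise ValueError(f"unknown format: {content_format}")
        self._events[(user_id, content_format)].append((content_id, watched_fraction))

    def history(self, user_id: str, content_format: str) -> list[tuple[str, float]]:
        """Events that a format-specific recommender is allowed to see."""
        return list(self._events.get((user_id, content_format), []))
```

The vertical feed would read only `history(user_id, "vertical")`, and the horizontal browser only `history(user_id, "horizontal")`, so a burst of short-clip swipes never reshapes the series recommendations.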
4. Metadata and Content Taxonomy
A horizontal series has seasons, episodes, runtimes, and a narrative arc. A vertical clip has duration, creator, topic tags, and a trending signal. These are structurally different content objects. Forcing both into a single content schema because the CMS was built for one format produces broken metadata across the other format: missing fields, incorrect episode counts, blank duration fields for clips, and absent creator attribution for series.
What is required: A dual-schema content model. Horizontal content: title, series, season, episode number, runtime, cast, genre, synopsis, rating. Vertical content: title, creator, duration, topic tags, trending score, challenge/sound association (if applicable), content series flag (if clips belong to a recurring format). Both schemas live in the same CMS but with format-specific field sets. Search and discovery logic queries the appropriate schema based on the surface the user is browsing.
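The dual-schema model can be sketched as two record types that share one catalog. The field lists below mirror the ones named above; the class names, the surface identifiers, and the `search` helper are assumptions made for illustration, not the schema of any specific CMS.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class HorizontalContent:
    title: str
    series: str
    season: int
    episode_number: int
    runtime_minutes: int
    cast: list[str]
    genre: str
    synopsis: str
    rating: str


@dataclass
class VerticalContent:
    title: str
    creator: str
    duration_seconds: int
    topic_tags: list[str]
    trending_score: float
    sound_association: Optional[str] = None  # only if applicable
    is_recurring_series: bool = False        # content series flag


CatalogItem = Union[HorizontalContent, VerticalContent]

# Hypothetical surface names mapped to the schema each one queries.
SURFACE_SCHEMA = {
    "series_browser": HorizontalContent,
    "clip_feed": VerticalContent,
}


def search(catalog: list[CatalogItem], surface: str, term: str) -> list[CatalogItem]:
    """Return title matches, restricted to the schema behind the given surface."""
    schema = SURFACE_SCHEMA[surface]
    needle = term.lower()
    return [item for item in catalog
            if isinstance(item, schema) and needle in item.title.lower()]
```

Both record types live in the same catalog list, but a search from the clip feed never returns an episode record with empty creator or trending fields, and the reverse holds for the series browser.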
5. Monetization Logic
Mid-roll ads in a 4-minute vertical clip are a poor user experience. A subscription paywall on a 20-second short defeats the point of the free scroll mechanic. Coin unlock for a 90-minute horizontal film requires different pricing logic than coin unlock for a 3-episode vertical drama arc. Monetization decisions that ignore format produce revenue models that work for one format and alienate users of the other.
What is required: Format-aware monetization rules. Vertical content: pre-roll or post-roll ads only (no mid-roll), coin unlock for premium vertical series arcs, brand integration as native content. Horizontal content: mid-roll ads for the AVOD tier, subscription or episode unlock for premium access, pay-per-view (PPV) for live or premiere content. The platform’s monetization engine must route each content type to the appropriate paywall and ad logic based on its format tag.
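A rules table keyed by format tag is one simple way to express this routing. The Python sketch below encodes only the options listed above. The identifiers are hypothetical, and a real monetization engine would also need pricing, tiers, and regional rules that are out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonetizationRule:
    ad_slots: frozenset[str]
    access_models: frozenset[str]


# Only the options named in the text are encoded; anything else is rejected.
RULES = {
    "vertical": MonetizationRule(
        ad_slots=frozenset({"pre_roll", "post_roll"}),
        access_models=frozenset({"coin_unlock"}),
    ),
    "horizontal": MonetizationRule(
        ad_slots=frozenset({"mid_roll"}),
        access_models=frozenset({"subscription", "episode_unlock", "pay_per_view"}),
    ),
}


def _rule_for(content_format: str) -> MonetizationRule:
    try:
        return RULES[content_format]
    except KeyError:
        raise ValueError(f"unknown format: {content_format}") from None


def is_ad_slot_allowed(content_format: str, slot: str) -> bool:
    """True if the ad slot is permitted for content with this format tag."""
    return slot in _rule_for(content_format).ad_slots


def is_access_model_allowed(content_format: str, model: str) -> bool:
    """True if the paywall or unlock model is permitted for this format tag."""
    return model in _rule_for(content_format).access_models
```

With this table, `is_ad_slot_allowed("vertical", "mid_roll")` is false, so a mid-roll break cannot be scheduled into a short vertical clip even if an ad campaign requests it.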

UX Design: The Navigation Architecture That Makes Both Formats Work
The navigation structure of a mixed-format platform is the design decision that determines whether the experience feels unified or fractured. Users should never feel like they are using two different apps. They should feel like they are using one platform that understands different types of content.
| Navigation Element | Horizontal-Only Platform | Mixed-Format Platform |
| --- | --- | --- |
| Home tab | Horizontal content grid with rows by genre | Format-aware home: horizontal rows + vertical Shorts shelf below the fold |
| Content tab | Series browser, landscape thumbnails | Separate tabs or sections: Series (horizontal) / Clips (vertical) |
| Player entry | Tap thumbnail → landscape player launches | Format detected on tap → portrait or landscape player launches automatically |
| In-player navigation | Episode list, quality selector, progress bar | Horizontal: full controls. Vertical: swipe-up for next, tap-to-pause, minimal chrome |
| Autoplay behavior | Autoplay the next episode in the same format | Autoplay next item in the same format surface; no cross-format autoplay without user signal |
| Search results | Grid of horizontal thumbnails with title/runtime | Format-tagged results: icon or label distinguishes clips from episodes |
| Creator profiles | Cast/director page linking to their titles | Creator page with vertical clip feed + horizontal series they appear in |
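The autoplay row in the table above reduces to a small rule: stay in the current format unless the user has signalled otherwise. A minimal Python sketch follows, in which the queue item shape and the `user_opted_into_cross_format` flag are assumptions made for illustration.

```python
from typing import Optional


def next_autoplay(current_format: str, queue: list[dict],
                  user_opted_into_cross_format: bool = False) -> Optional[dict]:
    """Return the first queued item that may autoplay, or None if there is none."""
    for item in queue:
        same_format = item["format"] == current_format
        # Cross-format items are skipped unless the user has explicitly opted in.
        if same_format or user_opted_into_cross_format:
            return item
    return None
```

Returning `None` when nothing qualifies lets the player stop and show an end screen instead of silently jumping from a vertical clip into a landscape episode.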
Content Strategy: What to Publish in Each Format
Not all content belongs in both formats. Forcing long-form narrative content into vertical creates a poor viewing experience. Producing vertical clips of horizontal series is an effective discovery tool — but only when done intentionally.
| Content Type | Format Recommendation and Strategic Role |
| --- | --- |
| Long-form drama series (10+ min/ep) | Horizontal only – 16:9 with a full landscape player. Subscription or episode unlock monetization. |
| Microdrama series (2-8 min/ep) | Vertical primary – 9:16. Coin unlock or ad-supported. High scroll-discovery potential. |
| Short clips & trailers (<2 min) | Vertical. Used as discovery content for horizontal series. Link to the full episode from the clip end screen. |
| Documentaries & long-form content | Horizontal only. Not suited to vertical scroll behavior. |
| Creator-led content (vlogs, reactions) | Vertical primary. Creator profile page as a hub. Ad-supported or creator fund model. |
| Live events & sports | Horizontal primary. Vertical highlight clips are published post-event as discovery content. |
| Series previews & episode recaps | Vertical. Maximum 60-90 seconds. Link to episode. High conversion from vertical preview to horizontal series. |
Technical Requirements: What the Platform Must Support
Building for mixed content formats is not a front-end problem. The back-end requirements are equally significant and are where most white-label or SaaS platforms fall short when asked to support both content types at production quality.
| Technical Requirement | Why It Matters for Mixed Content Formats |
| --- | --- |
| ABR streaming profiles per format | Vertical clips are consumed on mobile data; horizontal on WiFi or TV. ABR profiles must be tuned per format. |
| Dual thumbnail generation pipeline | Every piece of content needs format-appropriate thumbnails: landscape crop (16:9) for horizontal browsing, portrait crop (9:16) for vertical feed. A single shared pipeline produces wrong-aspect images for one of the two formats. |
| Format-aware player SDK | Single SDK with format-detection logic, not two separate players with conditional loading. Two-player approaches create maintenance overhead, inconsistent behavior, and session tracking gaps. |
| Unified analytics with format dimension | Watch time, completion, and engagement must be segmented by format. Averaging horizontal and vertical completion rates produces meaningless data for content investment decisions. |
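The analytics row is easiest to see with numbers. The Python sketch below segments completion rate by format and compares it with the blended average. The session records are made-up illustrative values, not measured data, and the function names are hypothetical.

```python
from collections import defaultdict


def completion_by_format(sessions: list[dict]) -> dict[str, float]:
    """Average completion rate per format tag."""
    totals: dict[str, list[float]] = defaultdict(lambda: [0.0, 0.0])
    for session in sessions:
        bucket = totals[session["format"]]
        bucket[0] += session["completion"]  # running sum
        bucket[1] += 1                      # running count
    return {fmt: total / count for fmt, (total, count) in totals.items()}


def blended_completion(sessions: list[dict]) -> float:
    """Single average across all formats, shown only for comparison."""
    return sum(s["completion"] for s in sessions) / len(sessions)


# Illustrative values only.
sessions = [
    {"format": "vertical", "completion": 1.0},
    {"format": "vertical", "completion": 0.5},
    {"format": "horizontal", "completion": 0.25},
    {"format": "horizontal", "completion": 0.25},
]

print(completion_by_format(sessions))  # {'vertical': 0.75, 'horizontal': 0.25}
print(blended_completion(sessions))    # 0.5
```

The blended figure of 0.5 describes neither format: it understates vertical completion and overstates horizontal completion, which is why the format dimension needs to be on every key metric.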
Where SaaS Platforms Hit the Mixed Content Formats Ceiling
Most SaaS OTT platforms were built when horizontal was the only format that mattered. Their architecture reflects that. Adding vertical content to a platform built for horizontal is not a configuration change. It is a series of workarounds that accumulate into an experience that feels broken in both formats.
Build Sequence: How to Add the Second Format Without Breaking the First
Most platforms launch in one format and add the second later. The sequence of that addition determines whether the result feels native or bolted-on.
| Stage | What to Build | What to Avoid |
| --- | --- | --- |
| 1. Player first | Format-aware player with auto-detection. Test portrait and landscape across 10+ device types before any new-format content goes live. | Launching new format content before the player handles it correctly. The first impression is permanent. |
| 2. Ingestion pipeline | Format-specific transcode profiles and thumbnail pipelines. Do not route new format content through existing pipelines. | Applying horizontal transcode profiles to vertical content. Mobile delivery quality will be poor. |
| 3. Dedicated discovery surface | Launch a new format in its own section. Do not mix with the existing format in the primary feed until the library depth can sustain 3-5 sessions without repetition. | A ‘Shorts’ shelf with 4 clips. Insufficient depth makes the surface feel broken immediately. |
| 4. Format-specific monetization | Configure separate ad and paywall rules per format. Validate before launch. | A 3-episode horizontal paywall depth applied to 60-second vertical clips. |
| 5. Segmented analytics | Add format dimension to all key metrics before launch. Segment completion, session length, and conversion by format from day one. | Reporting averaged cross-format metrics and making content decisions based on combined data. |
The question that determines whether your mixed-format platform works:
“Can a user move from a vertical clip to a related horizontal series and back without any interface friction, without a format explanation, and without the player misbehaving at any point in the journey?”
If yes, the architecture is working. If not, the gap is a product decision, and it is solvable with the right platform.
FAQ — Architecture and Product Decisions
Do we need to build two separate apps? One for horizontal, one for vertical?
No. Building two separate apps is the wrong approach. It splits your user base, doubles your maintenance overhead, and prevents cross-format discovery. A user who finds your platform through vertical clips and then discovers your horizontal series library is following one of the highest-value conversion paths available. One app with a format-aware architecture is the correct model. The navigation and player adapt to the format; the user does not have to.
Our current SaaS platform supports video upload in any aspect ratio. Doesn’t that mean it supports mixed content formats?
Accepting a video file in any aspect ratio is not the same as supporting mixed content formats. The question is what happens after upload: Does the player render vertical content correctly in portrait mode without letterboxing? Does the thumbnail pipeline generate format-appropriate crops? Are there separate discovery surfaces with format-specific UX conventions? Does the ad logic apply format-aware rules? If any of these answers is no, the platform accepts the file but does not support the format, and your users will feel the difference.
How much vertical content do we need before it is worth launching a vertical surface?
A minimum viable vertical library is roughly 20-30 distinct pieces of content. Below that threshold, the feed runs out of content within a short session and the experience feels broken. If you have fewer than 20 vertical pieces, present them as a single shelf row on the home screen rather than as a dedicated tab or surface. Launch the dedicated surface once the library is deep enough to sustain at least 3-5 sessions without repetition for an average user.
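The placement rule in this answer can be written down as a small check. In the Python sketch below, the threshold of 20 comes from the answer above, while `clips_per_session` is an assumed input that you would have to measure from your own audience. The function names are hypothetical.

```python
MIN_DEDICATED_LIBRARY = 20  # lower bound stated in the answer above


def vertical_placement(library_size: int) -> str:
    """Choose where vertical content should appear for a given library size."""
    if library_size < 0:
        raise ValueError("library_size must not be negative")
    if library_size < MIN_DEDICATED_LIBRARY:
        return "home_shelf"
    return "dedicated_surface"


def sessions_without_repetition(library_size: int, clips_per_session: int) -> int:
    """Full sessions the library can cover before any clip has to repeat."""
    if clips_per_session <= 0:
        raise ValueError("clips_per_session must be positive")
    return library_size // clips_per_session
```

For example, a library of 30 clips with an assumed 8 clips watched per session covers 3 full sessions, which sits at the low end of the 3-5 session target.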
What is the right monetization model for vertical content on a mixed-format OTT platform?
Ad-supported free access is the primary model for vertical (pre/post-roll only). Use coin unlock for premium vertical series arcs. In a mixed content format context, the strategic value of vertical is as a discovery funnel: a user who watches three vertical episodes of a drama is a high-conversion candidate for the full horizontal series on subscription. The cross-format conversion is the primary revenue value of the vertical library.
What platform architecture is required for mixed content formats at scale?
At minimum: a format-aware player SDK (single player, not two), format-tagged content pipelines with separate transcode profiles, dual thumbnail generation, format-segmented analytics, and configurable monetization rules per format. A white-label platform with customization capacity supports all of these as configuration decisions. A rigid SaaS platform requires workarounds at each layer, and those workarounds accumulate into an experience that feels unfinished in at least one format.






