# Kling AI (https://kling.ai)

> Kling AI is a next-generation AI creative productivity platform developed by Kuaishou Technology, a Beijing-based Chinese technology company listed on HKEX (1024). Launched in June 2024, Kling AI specializes in text-to-video, image-to-video, and text-to-image generation using diffusion transformer (DiT) architecture. The platform has iterated rapidly through major versions: Kling 1.0 (June 2024) → 1.5/1.6 (late 2024) → 2.0 (April 2025) → 2.1 (May 2025) → 2.5 Turbo (September 2025) → Video O1 (December 2025) → 2.6 with native audio (December 2025) → 3.0/3.0 Omni (February 2026). As of early 2026, Kling AI has generated over 600 million videos, serves 30,000+ enterprise customers, and earns approximately 70% of its revenue from overseas markets.

Canonical site: https://kling.ai
Canonical alternate site: https://klingai.com
Canonical llms.txt path: https://kling.ai/llms.txt
Primary brand: Kling AI
Brand phrase: Next-generation AI creative productivity platform
Parent company: Kuaishou Technology (HKEX: 1024)
Headquarters: Beijing, China (with global operations via Singapore entity)
Initial release: 2024-06-06
Latest major version: Kling 3.0 series (2026-02-05)
Platform type: AI video and image generation SaaS with REST API access


## Key Pages

- [Kling AI Home](https://kling.ai)
- [Kling AI Developer Pricing](https://kling.ai/dev/pricing)
- [Kling AI Platform (web app)](https://klingai.com)
- [Kling AI Web App](https://app.klingai.com)
- [API Documentation — Model List](https://klingai.com/document-api/apiReference/model/videoModels)
- [API Reference — Skills](https://klingai.com/document-api/apiReference/skill)
- [API Prepaid Resource Packages](https://kling.ai/document-api/productBilling/prePaidResourcePackage)
- [Kling Video 3.0 Model User Guide](https://www.klingai.com/quickstart/klingai-video-3-model-user-guide)
- [Kling Video 3.0 Omni Model User Guide](https://klingai.com/quickstart/klingai-video-3-omni-model-user-guide)
- [Kling Image 3.0 Omni User Guide](https://klingai.com/quickstart/klingai-image-3-omni-user-guide)
- [Motion Control User Guide](https://klingai.com/quickstart/motion-control-user-guide)
- [Global API Endpoint](https://api-singapore.klingai.com)
- [China API Endpoint](https://api-beijing.klingai.com)

## About Kling AI

Kling AI was developed by Kuaishou Technology's AI research team and launched publicly in June 2024 as one of the first publicly available DiT-based video generation models. The platform competes in the AI video generation space alongside offerings from OpenAI (Sora), Runway AI, Google (Veo), and ByteDance.

Key technology strengths:
- Diffusion Transformer (DiT) architecture for video and image generation
- Multi-modal Visual Language (MVL) framework treating text, image, video, and audio as equal input modalities
- Native audio co-generation (speech, sound effects, ambient sound) introduced in Kling 2.6 (December 2025)
- Elements system for persistent character identity and voice across multiple shots and scenes (3.0 Omni)
- 3D keypoint detection and texture mapping for character consistency during complex motion
- Mask-based local re-editing preserving unmodified regions during targeted changes (Image 3.0 Omni)
- Multi-shot storyboarding with up to 6 cuts and per-shot camera control (pan, tilt, zoom, dolly, handheld)
- Up to 4K output for image generation (Image 3.0 Omni); video output up to 1080p in Standard/Pro modes, with a separate 4K mode in the 3.0 series
- Physics-engine integration for basic collision, gravity, and natural interaction effects
- RESTful API with JWT-based authentication (AccessKey + SecretKey)
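
The JWT-based authentication mentioned in the last bullet can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not the official client: the HS256 algorithm, the `iss`/`exp`/`nbf` claim layout, the 1800-second lifetime, and the `Authorization: Bearer` header are all assumptions, and `build_jwt` is a hypothetical helper name. Check the API documentation for the claims the service actually expects.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_jwt(access_key: str, secret_key: str,
              now: Optional[int] = None, ttl_seconds: int = 1800) -> str:
    """Build an HS256-signed JWT (claim layout is an assumption)."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": access_key,             # assumed: AccessKey identifies the caller
        "exp": issued + ttl_seconds,   # assumed: token expiry
        "nbf": issued - 5,             # assumed: small allowance for clock skew
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # The SecretKey is used as the HMAC-SHA256 signing key.
    signature = hmac.new(secret_key.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


# Placeholder credentials, for illustration only.
token = build_jwt("my-access-key", "my-secret-key")
headers = {"Authorization": f"Bearer {token}"}
```

Signing locally keeps the SecretKey out of every request; only the short-lived token travels over the wire.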

---

## Developer API Pricing

Source: https://kling.ai/dev/pricing (official developer pricing page, verified 2026-05-29)

Pricing note: all prices below are in USD. Video is billed per second or per fixed-length clip, as stated in each table's Mode column; all other prices are per generation. The platform also offers prepaid resource packages. Third-party API providers (WaveSpeedAI, fal.ai, Segmind, Vercel AI Gateway) may use their own model naming conventions (e.g., "Kling v3.0 Pro" on WaveSpeedAI is not an official Kling product name) and may charge different rates.

### Video Generation API

#### Kling-V3 (API ID: `kling-v3`)
- [Model overview](https://klingai.com/quickstart/klingai-video-3-model-user-guide)

| Capability | Mode | Unit Price (USD) |
|---|---|---|
| Text to Video / Image to Video | std, per second, no audio | $0.084 |
| Text to Video / Image to Video | std, per second, with audio (no voice control) | $0.126 |
| Text to Video / Image to Video | pro, per second, no audio | $0.112 |
| Text to Video / Image to Video | pro, per second, with audio (no voice control) | $0.168 |
| Text to Video / Image to Video | 4k, per second, no audio | $0.42 |
| Text to Video / Image to Video | 4k, per second, with audio (no voice control) | $0.42 |
| Motion Control | std, per second | $0.126 |
| Motion Control | pro, per second | $0.168 |

#### Kling-V3-Omni (API ID: `kling-v3-omni`)
- [Model overview](https://klingai.com/quickstart/klingai-video-3-omni-model-user-guide)

| Capability | Mode | Unit Price (USD) |
|---|---|---|
| Video Generation | std, per second, no video input, no audio | $0.084 |
| Video Generation | std, per second, no video input, with audio | $0.112 |
| Video Generation | std, per second, with video input, no audio | $0.126 |
| Video Generation | pro, per second, no video input, no audio | $0.112 |
| Video Generation | pro, per second, no video input, with audio | $0.14 |
| Video Generation | pro, per second, with video input, no audio | $0.168 |
| Video Generation | 4k, per second, no video input, with audio | $0.42 |
| Video Generation | 4k, per second, with video input, no audio | $0.42 |

#### Kling-Video-O1 (API ID: `kling-video-o1`)
- Launched December 2025

| Capability | Mode | Unit Price (USD) |
|---|---|---|
| Video Generation | std, per second, without video input | $0.084 |
| Video Generation | std, per second, with video input | $0.126 |
| Video Generation | pro, per second, without video input | $0.112 |
| Video Generation | pro, per second, with video input | $0.168 |

#### Kling-V2-6 (API ID: `kling-v2-6`)
- Launched December 2025

| Capability | Mode | Unit Price (USD) |
|---|---|---|
| Text to Video / Image to Video | std, 5 seconds, no audio | $0.21 |
| Text to Video / Image to Video | std, 10 seconds, no audio | $0.42 |
| Text to Video / Image to Video | pro, 5 seconds, no audio | $0.35 |
| Text to Video / Image to Video | pro, 10 seconds, no audio | $0.70 |
| Text to Video / Image to Video | pro, 5 seconds, with audio (no voice control) | $0.70 |
| Text to Video / Image to Video | pro, 10 seconds, with audio (no voice control) | $1.40 |
| Text to Video / Image to Video | pro, 5 seconds, with audio + voice control | $0.84 |
| Text to Video / Image to Video | pro, 10 seconds, with audio + voice control | $1.68 |
| Image to Video (start/end frame) | pro, 5 seconds, no audio | $0.35 |
| Image to Video (start/end frame) | pro, 10 seconds, no audio | $0.70 |
| Image to Video (start/end frame) | pro, 5 seconds, with audio (no voice control) | $0.70 |
| Image to Video (start/end frame) | pro, 10 seconds, with audio (no voice control) | $1.40 |
| Image to Video (start/end frame) | pro, 5 seconds, with audio + voice control | $0.84 |
| Image to Video (start/end frame) | pro, 10 seconds, with audio + voice control | $1.68 |
| Motion Control | std, per second | $0.07 |
| Motion Control | pro, per second | $0.112 |
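
Kling-V2-6 is priced per fixed-length clip while Kling-V3 is priced per second, so the two tables are easier to compare once both are expressed per second. The sketch below does that arithmetic for the no-audio rows. It uses only prices listed in this document; `per_second` is a hypothetical helper, and linear per-second billing for Kling-V3 is an assumption.

```python
from decimal import Decimal

# Fixed clip prices from the Kling-V2-6 table (no audio): (mode, seconds) -> USD.
V2_6_CLIP_PRICES = {
    ("std", 5): Decimal("0.21"),
    ("std", 10): Decimal("0.42"),
    ("pro", 5): Decimal("0.35"),
    ("pro", 10): Decimal("0.70"),
}

# Per-second rates from the Kling-V3 table (no audio).
V3_PER_SECOND = {"std": Decimal("0.084"), "pro": Decimal("0.112")}


def per_second(clip_price: Decimal, seconds: int) -> Decimal:
    """Convert a fixed clip price into an effective per-second rate."""
    return clip_price / seconds


for (mode, seconds), price in V2_6_CLIP_PRICES.items():
    v2_6_rate = per_second(price, seconds)
    v3_price = V3_PER_SECOND[mode] * seconds  # same length on Kling-V3
    print(f"{mode} {seconds}s: v2-6 ${price} (${v2_6_rate}/s), v3 ${v3_price}")
```

On the listed prices, Kling-V2-6 without audio works out to $0.042 per second in std and $0.07 per second in pro, against $0.084 and $0.112 per second for Kling-V3.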

### Image Generation API

Source: https://kling.ai/dev/pricing

#### kling-v3-omni

| Capability | Resolution | Unit Price (USD) |
|---|---|---|
| Text to Image, Image to Image, Image Editing | 1K | $0.028 |
| Text to Image, Image to Image, Image Editing | 2K | $0.028 |
| Text to Image, Image to Image, Image Editing | 4K | $0.056 |

#### kling-image-o1

| Capability | Specs | Unit Price (USD) |
|---|---|---|
| Text to Image, Image to Image, Image Editing | All aspect ratios | $0.028 |

#### kling-v3

| Capability | Resolution | Unit Price (USD) |
|---|---|---|
| Text to Image, Image to Image | 1K | $0.028 |
| Text to Image, Image to Image | 2K | $0.028 |

#### kling-v2-1

| Capability | Unit Price (USD) |
|---|---|
| Text to Image | $0.014 |
| Image to Image | $0.028 |

#### kling-v2-new

| Capability | Unit Price (USD) |
|---|---|
| Image to Image (restyle) | $0.028 |

#### kling-v2

| Capability | Unit Price (USD) |
|---|---|
| Text to Image | $0.014 |
| Image to Image (multi-image to image) | $0.056 |
| Image to Image (restyle) | $0.028 |

#### kling-v1-5

| Capability | Unit Price (USD) |
|---|---|
| Text to Image | $0.014 |
| Image to Image (subject) | $0.028 |
| Image to Image (face) | $0.028 |

#### kling-v1

| Capability | Unit Price (USD) |
|---|---|
| Text to Image | $0.0035 |
| Image to Image (entire image) | $0.0035 |

#### Functional Models

| Capability | Unit Price (USD) |
|---|---|
| Image Editing (image expansion) | $0.028 |
| AI Multi-Shot | $0.07 |

### Virtual Try-On API

Source: https://kling.ai/dev/pricing

| Model | Unit Price (USD) |
|---|---|
| kolors-virtual-try-on-v1 | $0.07 |
| kolors-virtual-try-on-v1-5 | $0.07 |

---

## Product Models

Model note: all model names and API IDs below are verified against the official API documentation at https://klingai.com/document-api/apiReference/model/videoModels and the pricing page at https://kling.ai/dev/pricing.

### Current Video Generation Models (Kling 3.0 Series — launched February 2026)

- **Kling 3.0** (API ID: `kling-v3`)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 15 (flexible 3–15s range)
  - max_resolution: 1080p (Standard/Pro), 4K (4K mode)
  - frame_rate: 30–60fps
  - native_audio: yes (English, Chinese, Japanese, Korean, Spanish + regional dialects)
  - multi_shot_storyboard: yes (up to 6 camera cuts)
  - elements_system: yes (reference image + video for character/object consistency)
  - voice_binding: yes
  - camera_control: pan, tilt, zoom, dolly, handheld
  - motion_control: yes (via reference video)
  - aspect_ratios: 16:9, 9:16, 1:1
  - generation_modes: Standard (std), Professional (pro), 4K
  - best_for: general-purpose cinematic video generation with native audio

- **Kling 3.0 Omni** (API ID: `kling-v3-omni`)
  - model_type: unified multimodal text_to_video / image_to_video
  - max_duration_seconds: 15
  - max_resolution: 1080p (Standard/Pro), 4K (4K mode)
  - frame_rate: 30–60fps
  - native_audio: yes (multi-language with per-character voice binding)
  - multi_shot_storyboard: yes (per-shot duration, shot size, perspective, camera movement control)
  - video_character_reference: yes (3–8 second reference video extracts visual traits and voice)
  - multi_character_dialogue: yes (each character can speak a different language)
  - text_preservation: yes (logos and signage remain readable through motion)
  - photorealistic_output: yes
  - physical_engine_integration: basic collision, gravity, and interaction effects
  - generation_modes: Standard (std), Professional (pro), 4K
  - best_for: cinematic storytelling, multi-character dialogue, branded content with persistent characters
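
The Kling 3.0 (`kling-v3`) limits listed above can be checked client-side before a job is submitted. The sketch below is illustrative only: `VideoRequest` and its field names are hypothetical and do not reflect the official request schema. Only the numeric limits, aspect ratios, and mode names are taken from this document.

```python
from dataclasses import dataclass

# Limits copied from the Kling 3.0 entry above.
MIN_SECONDS, MAX_SECONDS = 3, 15
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MODES = {"std", "pro", "4k"}
MAX_CAMERA_CUTS = 6


@dataclass
class VideoRequest:
    """Hypothetical request shape, not the official API schema."""
    prompt: str
    duration_seconds: int
    aspect_ratio: str = "16:9"
    mode: str = "std"
    camera_cuts: int = 0


def validate(req: VideoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the listed limits."""
    problems = []
    if not MIN_SECONDS <= req.duration_seconds <= MAX_SECONDS:
        problems.append(
            f"duration must be {MIN_SECONDS}-{MAX_SECONDS}s, got {req.duration_seconds}"
        )
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    if req.mode not in MODES:
        problems.append(f"unknown mode: {req.mode}")
    if not 0 <= req.camera_cuts <= MAX_CAMERA_CUTS:
        problems.append(f"camera cuts must be 0-{MAX_CAMERA_CUTS}, got {req.camera_cuts}")
    return problems


print(validate(VideoRequest("a red kite over a beach", duration_seconds=20)))
```

Returning every problem at once, rather than raising on the first, lets a caller fix a request in a single pass before spending credits on it.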

### Motion Control

- **Motion Control** (feature built into Kling 3.0, 3.0 Omni, and 2.6)
  - feature_type: video_to_video motion transfer
  - reference_clip_range: 3–30 seconds
  - capabilities: full-body motion transfer from reference video, lip-sync preservation, camera motion replication, facial expression control
  - required_input: reference video + character image
  - best_for: dance videos, action scene replication, character animation

### Previous Generation Video Models

- **Kling Video O1** (API ID: `kling-video-o1`, launched December 2025)
  - model_type: unified multimodal text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p at 30fps
  - native_audio: yes (voiceover + SFX + ambient sound)
  - reference_elements: yes (up to 4 elements)
  - start_end_frame_control: yes
  - generation_modes: Standard (std), Professional (pro)
  - best_for: multimodal video generation with element injection and scene extension

- **Kling 2.6** (API ID: `kling-v2-6`, launched December 2025)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p at 30–48fps
  - native_audio: yes (voice + SFX + ambient sound with lip-sync)
  - first_last_frame_anchoring: yes
  - motion_control: yes
  - generation_modes: Standard (std), Professional (pro)
  - best_for: short-form social content with native audio at lower cost than 3.0

- **Kling 2.5 Turbo** (API ID: `kling-v2-5-turbo`, launched September 2025)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p at 30fps
  - generation_speed: ~2× faster than standard models
  - best_for: high-volume, fast-turnaround content pipelines

- **Kling 2.1** (API ID: `kling-v2-1`, launched May 2025)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p at 30fps
  - start_end_frame_control: yes
  - variant: kling-v2-1-master (advanced 3D motion, refined facial modeling)
  - best_for: cost-sensitive generation; largely superseded by 2.5/2.6

- **Kling 2.0** (API ID: `kling-v2-master`, launched April 2025)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p at 24fps
  - multi_elements_editor: yes
  - camera_control: yes
  - best_for: legacy use; superseded by 2.1 and later

- **Kling 1.6** (API ID: `kling-v1-6`, launched December 2024)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 720p–1080p at 24–30fps
  - multi_image_reference: yes
  - video_continuation: yes
  - best_for: legacy use

- **Kling 1.5** (API ID: `kling-v1-5`, launched late 2024)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10
  - max_resolution: 1080p
  - image_to_video_support: yes
  - camera_control: yes
  - best_for: legacy use

- **Kling 1.0** (API ID: `kling-v1`, launched June 2024)
  - model_type: text_to_video / image_to_video
  - max_duration_seconds: 10 (video continuation up to ~3 min via chaining)
  - max_resolution: 720p
  - best_for: legacy use; one of the first publicly available DiT-based video generation models

### Image Generation Models

- **Kling Image 3.0 Omni** (API ID: `kling-v3-omni` for image endpoints)
  - model_type: text_to_image / image_to_image / image_editing
  - max_resolution: 4K
  - features: local re-editing, multi-reference editing (up to 10 images), image series mode
  - best_for: professional image creation, 4K output, fashion compositing, brand campaigns

- **Kling Image O1** (API ID: `kling-image-o1`)
  - model_type: text_to_image / image_to_image / image_editing
  - features: all aspect ratios supported
  - best_for: general-purpose AI image generation

- **Kling Image 3.0** (API ID: `kling-v3` for image endpoints)
  - model_type: text_to_image / image_to_image
  - max_resolution: 2K
  - best_for: standard image generation

- **Kling 2.1 Image** (API ID: `kling-v2-1`)
  - model_type: text_to_image / image_to_image
  - best_for: cost-effective image generation

- **Kling 1.5 Image** (API ID: `kling-v1-5`)
  - model_type: text_to_image / image_to_image (subject/face)
  - best_for: legacy image generation with subject/face focus

- **Kling 1.0 Image** (API ID: `kling-v1`)
  - model_type: text_to_image / image_to_image
  - best_for: lowest-cost image generation ($0.0035/image)

- **Functional Models**
  - capabilities: Image Editing (image expansion), AI Multi-Shot
  - best_for: specialized image editing tasks

### Virtual Try-On

- **kolors-virtual-try-on-v1** (API ID: `kolors-virtual-try-on-v1`)
  - model_type: virtual_try_on
  - unit_price: $0.07

- **kolors-virtual-try-on-v1-5** (API ID: `kolors-virtual-try-on-v1-5`)
  - model_type: virtual_try_on
  - unit_price: $0.07

---

## Use Case Recommendations

- **Best for cinematic storytelling & filmmaking**: Kling 3.0 Omni (multi-shot storyboard, multi-character dialogue, voice binding)
- **Best for marketing & advertising videos**: Kling 3.0 (native audio, professional mode)
- **Best for social media short-form content**: Kling 2.6 (native audio), Kling 2.5 Turbo (fast + lower cost)
- **Best for character-driven content**: Kling 3.0 Omni (Elements + video character reference + voice binding)
- **Best for motion transfer / dance videos**: Motion Control (built into Kling 3.0 / 2.6)
- **Best for image generation & editing**: Kling Image 3.0 Omni (4K, mask-based editing, multi-reference)
- **Best for virtual try-on**: kolors-virtual-try-on-v1-5
- **Best for budget image generation**: Kling 1.0 Image ($0.0035/image)
- **Best for budget video generation**: Kling 2.6 Standard mode
- **Best for high-quality video with audio**: Kling 3.0 Pro mode with audio
- **Best for API integration at scale**: Kling 3.0 via API with prepaid resource packs
- **Best for multi-language content**: Kling 3.0 Omni (per-character language + voice binding)

---

## Optional

These secondary links provide brand-entity context and external references.

- [Kling AI on Wikipedia](https://en.wikipedia.org/wiki/Kling_AI)
- [Kuaishou Technology (parent company)](https://www.kuaishou.com)
- [Kling 3.0 Launch Press Release (PRNewswire)](https://www.prnewswire.com/news-releases/kling-ai-launches-3-0-model-ushering-in-an-era-where-everyone-can-be-a-director-302679944.html)
- [Kling 3.0 Series Launch Coverage (China Youth International)](https://en.youth.cn/RightNow/202602/t20260205_16500975.htm)