This is Part 3 of our pipeline series. (See the Architecture Overview for context.) In Part 2 we downloaded the individual .mp4 clips to the local filesystem. Here we turn their caption metadata into a narrated script with OpenAI and concatenate the clips with FFmpeg.
Series Navigation
- Part 1: Interfacing with the Apify Python SDK
- Part 2: Asynchronous video ingestion and connection pooling
- Part 3: LLM script synthesis and FFmpeg concatenation ← You are here
- Part 4: OAuth2 authentication and YouTube Data API uploads
Engineering dependencies
We use the official OpenAI Python client (the synchronous OpenAI class, not the async variant) and the command-line FFmpeg application. FFmpeg is invoked through Python's subprocess module, which gives direct control over the transcode flags.
pip install openai
Make sure the FFmpeg binary is installed and available on your PATH (sudo apt install ffmpeg or brew install ffmpeg).
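Since every FFmpeg call in this part goes through subprocess, it can help to fail early when the binary is missing. A minimal sketch, assuming a hypothetical helper named require_binary (not part of the original pipeline) built on the standard library's shutil.which:

```python
import shutil


def require_binary(name: str) -> str:
    """Return the absolute path of an executable on PATH, or raise if it is absent.

    Hypothetical helper for illustration; the name is not from the pipeline itself.
    """
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"{name} not found on PATH; install it before running the pipeline")
    return path
```

Calling require_binary("ffmpeg") once at startup surfaces a missing install as a clear error rather than a FileNotFoundError from the first subprocess.run call.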
Prompt engineering the LLM component
We build a narration that bridges the separate .mp4 clips by passing each clip's desc caption metadata to the LLM as prompt context.
# synthesize_script.py
from openai import OpenAI

SYSTEM_PROMPT = """
You are a factual, un-biased news synthesis engine. Given a bulleted list of localized multimedia captions originating from social media on a specific geographic event, synthesize a tight 60-second broadcast script. Return raw text without markdown headers, HTML, commentary, or social media references.
"""


def synthesize_broadcast(api_key: str, topic: str, context: list[str]) -> str:
    """Calls the OpenAI Chat Completions API to generate a continuous broadcast script."""
    client = OpenAI(api_key=api_key)
    # Keep only the first 20 captions to bound the prompt size
    pruned_context = context[:20]
    payload = f"Topic Context: {topic}\n\nIngestion Metadata:\n"
    payload += "\n".join(f"- {c}" for c in pruned_context)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.3,  # Low sampling variance
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": payload},
        ],
    )
    # message.content is optional in the SDK types; fall back to an empty string
    return resp.choices[0].message.content or ""


def synthesize_audio(api_key: str, script: str, out_path: str) -> None:
    """Streams text-to-speech audio for the script to out_path."""
    client = OpenAI(api_key=api_key)
    with client.audio.speech.with_streaming_response.create(
        model="tts-1",
        voice="onyx",
        input=script,
    ) as response:
        response.stream_to_file(out_path)
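Keeping the temperature low (0.3) reduces run-to-run variance, and the system prompt constrains the model to a factual, plain-text register. Together these make the output more consistent, but they do not guarantee accuracy: the script is only as reliable as the captions it is built from.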
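The user message is plain string assembly, so it can be inspected without calling the API. A minimal sketch that mirrors the payload built inside synthesize_broadcast; the helper name build_payload, the limit parameter, and the sample captions are assumptions introduced here for illustration:

```python
def build_payload(topic: str, captions: list[str], limit: int = 20) -> str:
    """Assemble the user message: a topic line, then one bullet per caption.

    Hypothetical helper; it reproduces the string logic of synthesize_broadcast
    so the prompt can be reviewed without an API call.
    """
    pruned = captions[:limit]  # keep only the first `limit` captions
    payload = f"Topic Context: {topic}\n\nIngestion Metadata:\n"
    payload += "\n".join(f"- {c}" for c in pruned)
    return payload


if __name__ == "__main__":
    # Sample data, made up for the example
    print(build_payload("Harbor flooding", ["Water over the pier", "Road closed"]))
```

Printing the payload for a handful of real captions is a cheap way to spot empty or duplicated entries before spending tokens on them.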
Demuxing and concatenating via FFmpeg
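With narration.mp3 generated, we write an intermediate .txt list file for FFmpeg's concat demuxer, re-encode the concatenated video to H.264 (libx264) at a uniform 1920x1080 frame, and then mux in the narration. The second pass copies the video stream untouched and encodes only the audio, to AAC. Note that the concat demuxer expects all inputs to share the same codecs and stream layout, so clips from mixed sources may need to be normalized individually first.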
# build_artifacts.py
import subprocess
from pathlib import Path


def concat_and_mux(video_paths: list[Path], audio_path: Path, out_path: Path) -> None:
    """
    Standardizes aspect ratios, drops source audio channels, concatenates video tracks,
    and muxes the secondary audio track.
    """
    concat_txt = Path("concat.txt")
    tmp_vid = Path("tmp_vid.mp4")
    try:
        # 1. Build FFmpeg concat instruction file
        with open(concat_txt, "w") as f:
            for vp in video_paths:
                # Inside single quotes, the concat list format escapes ' as '\''
                escaped = str(vp.absolute()).replace("'", "'\\''")
                f.write(f"file '{escaped}'\n")

        # 2. Transcode parameters to normalize resolutions to 1080p 16:9
        subprocess.run([
            "ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(concat_txt),
            "-c:v", "libx264", "-crf", "24", "-preset", "veryfast",
            "-r", "30", "-an",  # Drop source audio track (-an)
            "-vf", "scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2",
            str(tmp_vid),
        ], check=True)

        # 3. Mux audio and video tracks, truncate at shortest stream
        subprocess.run([
            "ffmpeg", "-y",
            "-i", str(tmp_vid), "-i", str(audio_path),
            "-c:v", "copy", "-c:a", "aac", "-b:a", "192k",
            "-shortest",
            str(out_path),
        ], check=True)
    finally:
        # Teardown artifacts, even if an FFmpeg call failed
        concat_txt.unlink(missing_ok=True)
        tmp_vid.unlink(missing_ok=True)
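The scale filter with force_original_aspect_ratio=decrease fits each clip inside a 1920x1080 frame without distorting it, and pad centers the result with black bars, so vertical TikTok clips come out pillarboxed rather than stretched. Because the second pass uses -c:v copy, the video is encoded only once; only the narration is encoded (to AAC) during muxing.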
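To see what the scale and pad filters do to a given clip, the arithmetic can be reproduced in a few lines. This is a sketch of the geometry only, under the assumption that rounding to the nearest pixel is close enough; FFmpeg's own rounding may differ by a pixel, and fit_and_pad is a hypothetical name, not an FFmpeg or pipeline function:

```python
def fit_and_pad(
    in_w: int, in_h: int, box_w: int = 1920, box_h: int = 1080
) -> tuple[int, int, int, int]:
    """Approximate scale with force_original_aspect_ratio=decrease, then a centered pad.

    Returns (scaled_w, scaled_h, pad_x, pad_y). Illustrative only.
    """
    # Use the smaller scale factor so the clip fits inside the box on both axes
    factor = min(box_w / in_w, box_h / in_h)
    scaled_w = round(in_w * factor)
    scaled_h = round(in_h * factor)
    # pad=W:H:(ow-iw)/2:(oh-ih)/2 centers the scaled clip in the output frame
    pad_x = (box_w - scaled_w) // 2
    pad_y = (box_h - scaled_h) // 2
    return scaled_w, scaled_h, pad_x, pad_y
```

For a 640x480 input this gives a 1440x1080 picture with 240 pixels of padding on the left and right. A 1080x1920 vertical clip comes out about 608 pixels wide, so most of the 16:9 frame is black bars; that trade-off is worth knowing before choosing a landscape output for vertical sources.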
In Part 4: OAuth2 authentication and YouTube Data API uploads, we discuss orchestrating the final stage via the google-api-python-client.
Need more TikTok data inputs? You can also feed this pipeline with trending videos, hashtag data, or user profiles. See our full collection of TikTok and Twitter scraping tools for additional ingestion sources.