DraganSr: 2024-08-24

Saturday, August 24, 2024

OpenAI API: text-to-speech

available voices: alloy, echo, fable, onyx, nova, and shimmer

openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision @GitHub

python

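from pathlib import Path
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Save the audio next to this script.
speech_file_path = Path(__file__).parent / "speech.mp3"
response = client.audio.speech.create(
  model="tts-1",
  voice="alloy",
  input="Today is a wonderful day to build something people love!"
)

# Write the returned audio to disk.
response.stream_to_file(speech_file_path)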

node.js

import fs from "fs";
import path from "path";
import OpenAI from "openai";

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

// Output path, resolved relative to the current working directory.
const speechFile = path.resolve("./speech.mp3");

async function main() {
  const mp3 = await openai.audio.speech.create({
    model: "tts-1",
    voice: "alloy",
    input: "Today is a wonderful day to build something people love!",
  });
  console.log(speechFile);
  // Convert the response body to a Buffer and write it to disk.
  const buffer = Buffer.from(await mp3.arrayBuffer());
  await fs.promises.writeFile(speechFile, buffer);
}
main();

max input text length for the OpenAI TTS API: 4096 characters per request
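
One way to stay under that limit is to split the text before sending it. The following is only a minimal sketch: `split_text` is a hypothetical helper (not part of the OpenAI SDK), it breaks on whitespace, and as a side effect it collapses newlines and repeated spaces into single spaces. Each returned chunk could then be passed as `input` to a separate `client.audio.speech.create(...)` call.

```python
def split_text(text: str, limit: int = 4096) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: breaks on whitespace where possible and
    hard-cuts any single word that is longer than the limit.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A word longer than the limit cannot fit in any chunk, so cut it.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries instead of single words would probably sound more natural in the resulting audio; the word-based version is just the shortest thing that respects the limit.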

solution: create several smaller mp3 files, then concatenate them into a single file, e.g. by using this module
fluent-ffmpeg - npm

this in turn requires a separate (CLI) app installed on the computer

About FFmpeg

nothing is simple...

Download FFmpeg

Builds - CODEX FFMPEG @ gyan.dev

node.js - In node, how does one concatenate mp3 files using streams - Stack Overflow
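
Once the FFmpeg CLI is installed, the concatenation can also be driven without the npm module. Below is a rough Python sketch, assuming `ffmpeg` is on the PATH and using FFmpeg's concat demuxer (`-f concat`) with stream copy (`-c copy`). The function names, the `parts.txt` list file and the mp3 file names are made up for illustration, and file paths containing a single quote are not handled.

```python
import subprocess
from pathlib import Path


def build_concat_command(parts, output, list_file="parts.txt"):
    """Write an ffmpeg concat list file and return the command that joins the parts."""
    # The concat demuxer reads one "file '<path>'" line per input.
    lines = [f"file '{Path(p).resolve().as_posix()}'" for p in parts]
    Path(list_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # -y: overwrite the output; -safe 0: allow absolute paths in the list;
    # -c copy: join the streams without re-encoding.
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",
        str(output),
    ]


def concat_mp3(parts, output):
    """Run ffmpeg to join the parts (requires the ffmpeg CLI to be installed)."""
    subprocess.run(build_concat_command(parts, output), check=True)
```

Usage would look like `concat_mp3(["speech_000.mp3", "speech_001.mp3"], "speech.mp3")`. Stream copy should work here because all parts come from the same TTS model and therefore share the same encoding settings; for mixed sources re-encoding may be needed.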

Android: upload multiple photos to google drive

pixel phone how to upload multiple photos to google drive - Google Search

use the Google Drive app

then select multiple images from the Gallery or Photos app, tap "Done" and that is it.

best one when on WiFi