AI Voice

> Speech-to-Text (STT) and Text-to-Speech (TTS) interfaces.

Overview

[ [LIB.AI.05] STATUS ]

A dual-purpose voice interface for transcribing audio and generating speech. Features a real-time audio visualizer, multiple voice selections, and support for audio file uploads.

Features

[ [0x01] SPEECH_TO_TEXT ]

STATUS: READY

DESC: High-accuracy transcription for microphone input or file uploads.

[ [0x02] TEXT_TO_SPEECH ]

STATUS: READY

DESC: Natural-sounding voice generation with multiple accents.

[ [0x03] VISUALIZER ]

STATUS: READY

DESC: Real-time audio waveform visualization.

[ [0x04] MULTI_LANGUAGE ]

STATUS: READY

DESC: Support for transcription and synthesis in multiple languages.
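The page does not show how the visualizer feature listed above is built. As a rough sketch only, one common approach reduces a buffer of time-domain samples to a handful of bar heights and redraws them each frame. The function name `waveformBars` and its parameters are hypothetical. The input layout (bytes in 0..255 with silence at 128) matches what the Web Audio `AnalyserNode.getByteTimeDomainData` method writes into a `Uint8Array`.

```typescript
// Hypothetical helper, not part of the documented component.
// Reduces time-domain samples (0..255, silence at 128) to `barCount`
// peak levels in the range 0..1, one per bar of the waveform display.
function waveformBars(samples: Uint8Array, barCount: number): number[] {
  const bars: number[] = [];
  if (barCount <= 0 || samples.length === 0) return bars;

  const bucketSize = samples.length / barCount;
  for (let i = 0; i < barCount; i++) {
    const start = Math.floor(i * bucketSize);
    // Every bucket covers at least one sample.
    const end = Math.max(start + 1, Math.floor((i + 1) * bucketSize));
    let peak = 0;
    for (let j = start; j < end && j < samples.length; j++) {
      // Distance from the silence level, scaled to 0..1.
      const amplitude = Math.abs(samples[j] - 128) / 128;
      if (amplitude > peak) peak = amplitude;
    }
    bars.push(Math.min(1, peak));
  }
  return bars;
}

// In a browser, the samples would typically be refreshed on each animation
// frame with analyser.getByteTimeDomainData(buffer) before calling this.
```

Taking the peak of each bucket rather than the average keeps short transients visible when many samples collapse into a single bar.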

Setup

[ [0x01] IMPORT COMPONENTS ]

DESC: Add the voice tools to your page.

import { Card, CardHeader, CardContent } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Textarea } from "@/components/ui/textarea";

Usage

[ [0xAC] SPEECH-TO-TEXT ]

Transcribe audio files using OpenAI Whisper.

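const handleTranscribe = async (audioFile: File): Promise<string> => {
  // Send the file as multipart/form-data; the browser sets the boundary header.
  const formData = new FormData();
  formData.append('audio', audioFile);

  const response = await fetch('/api/ai/speech-to-text', {
    method: 'POST',
    body: formData,
  });
  // fetch only rejects on network errors, so check the HTTP status explicitly.
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.text; // Transcribed text
};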

[ [0xD4] TEXT-TO-SPEECH ]

Generate speech using OpenAI TTS.

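const handleSpeak = async (text: string, voice: string): Promise<void> => {
  const response = await fetch('/api/ai/text-to-speech', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text, voice }),
  });
  // fetch only rejects on network errors, so check the HTTP status explicitly.
  if (!response.ok) {
    throw new Error(`Speech generation failed with status ${response.status}`);
  }
  const blob = await response.blob();
  const url = URL.createObjectURL(blob);
  // Play the audio (the same URL could back a download link instead)
  const audio = new Audio(url);
  // Release the object URL once playback ends so the blob can be freed.
  audio.addEventListener('ended', () => URL.revokeObjectURL(url));
  await audio.play();
};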

Configuration

[ [0xA0] CONFIG OPTIONS ]

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| voice | `"alloy" \| "echo" \| "fable" \| "onyx" \| "nova" \| "shimmer"` | `"alloy"` | OpenAI TTS voice selection. |
| audioFormats | `string[]` | `["mp3", "wav", "m4a", "webm"]` | Supported input formats for STT. |
| maxFileSize | `number` | `25MB` | Maximum audio file size for transcription. |
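
The `audioFormats` and `maxFileSize` options suggest that a file can be checked before it is uploaded for transcription. The following is a minimal sketch of such a check, not the component's actual implementation. `VoiceConfig`, `AudioFileLike`, and `validateAudioFile` are hypothetical names, and the sketch assumes that `maxFileSize` is expressed in bytes and that the `25MB` default means 25 * 1024 * 1024.

```typescript
// Hypothetical types and helper; only the option names come from the table.
interface VoiceConfig {
  audioFormats: string[];
  maxFileSize: number; // Assumption: bytes.
}

const DEFAULT_CONFIG: VoiceConfig = {
  audioFormats: ["mp3", "wav", "m4a", "webm"],
  maxFileSize: 25 * 1024 * 1024, // Assumption: "25MB" read as 25 MiB.
};

// Minimal shape that a browser File object also satisfies.
interface AudioFileLike {
  name: string;
  size: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateAudioFile(
  file: AudioFileLike,
  config: VoiceConfig = DEFAULT_CONFIG
): ValidationResult {
  // Compare the lowercased file extension against the allowed formats.
  const dot = file.name.lastIndexOf(".");
  const ext = dot >= 0 ? file.name.slice(dot + 1).toLowerCase() : "";
  if (config.audioFormats.indexOf(ext) === -1) {
    return { ok: false, reason: `Unsupported format: ${ext || "(none)"}` };
  }
  if (file.size > config.maxFileSize) {
    return { ok: false, reason: "File exceeds maxFileSize" };
  }
  return { ok: true };
}
```

Calling `validateAudioFile(file)` before the transcription request lets the page reject an oversized or unsupported file without a round trip. An extension check is only a convenience, so the server would still need to validate the upload itself.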

Voice Options

[ [0xD8] AVAILABLE VOICES ]

ALLOY: Neutral, balanced voice
ECHO: Male voice
FABLE: British accent
ONYX: Deep male voice
NOVA: Female voice
SHIMMER: Soft female voice
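
The voice names above can be captured as a TypeScript union so that an invalid selection is caught before a request is sent. This is a sketch under stated assumptions: `VOICES`, `isVoice`, and `resolveVoice` are hypothetical names, and the fallback to `"alloy"` follows the default given in the configuration table.

```typescript
// Hypothetical helpers; the voice names and default come from this page.
const VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"] as const;
type Voice = (typeof VOICES)[number];

const DEFAULT_VOICE: Voice = "alloy";

// Type guard: narrows an arbitrary string to the Voice union.
function isVoice(value: string): value is Voice {
  return (VOICES as readonly string[]).indexOf(value) !== -1;
}

// Normalizes user input (for example "NOVA" from a select label) and
// falls back to the default voice when the value is not recognized.
function resolveVoice(value: string | undefined): Voice {
  const candidate = (value ?? "").trim().toLowerCase();
  return isVoice(candidate) ? candidate : DEFAULT_VOICE;
}
```

Lowercasing the input matters here because the list shows the names in uppercase while the `voice` option uses lowercase values.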
