适用于 Java 的 AWS 开发工具包
开发人员指南
AWS 文档中描述的 AWS 服务或功能可能因区域而异。要查看适用于中国区域的差异,请参阅中国的 AWS 服务入门

使用 Amazon Transcribe

以下示例显示如何使用 Amazon Transcribe 进行双向流式处理。双向流式处理意味着数据流同时转向服务且实时接收回来。此示例使用 Amazon Transcribe 流式转录发送音频流并实时收回转录的文本流。

请参阅 Amazon Transcribe Developer Guide 中的流式转录以了解有关此功能的更多信息。

请参阅 Amazon Transcribe Developer Guide 中的入门以了解如何开始使用 Amazon Transcribe。

设置麦克风

此代码使用 javax.sound.sampled 软件包流式处理来自输入设备的音频。

代码

import javax.sound.sampled.AudioFormat; import javax.sound.sampled.AudioSystem; import javax.sound.sampled.DataLine; import javax.sound.sampled.TargetDataLine; public class Microphone { public static TargetDataLine get() throws Exception { AudioFormat format = new AudioFormat(16000, 16, 1, true, false); DataLine.Info datalineInfo = new DataLine.Info(TargetDataLine.class, format); TargetDataLine dataLine = (TargetDataLine) AudioSystem.getLine(datalineInfo); dataLine.open(format); return dataLine; } }

请参阅 GitHub 上的完整示例

创建发布者

该代码实现一个发布者,用于发布来自 Amazon Transcribe 音频流的音频数据。

代码

import java.io.IOException; import java.io.InputStream; import java.io.UncheckedIOException; import java.nio.ByteBuffer; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.atomic.AtomicLong; import org.reactivestreams.Publisher; import org.reactivestreams.Subscriber; import org.reactivestreams.Subscription; import software.amazon.awssdk.core.SdkBytes; import software.amazon.awssdk.services.transcribestreaming.model.AudioEvent; import software.amazon.awssdk.services.transcribestreaming.model.AudioStream; public class AudioStreamPublisher implements Publisher<AudioStream> { private final InputStream inputStream; public AudioStreamPublisher(InputStream inputStream) { this.inputStream = inputStream; } @Override public void subscribe(Subscriber<? super AudioStream> s) { s.onSubscribe(new SubscriptionImpl(s, inputStream)); } private class SubscriptionImpl implements Subscription { private static final int CHUNK_SIZE_IN_BYTES = 1024 * 1; private ExecutorService executor = Executors.newFixedThreadPool(1); private AtomicLong demand = new AtomicLong(0); private final Subscriber<? super AudioStream> subscriber; private final InputStream inputStream; private SubscriptionImpl(Subscriber<? super AudioStream> s, InputStream inputStream) { this.subscriber = s; this.inputStream = inputStream; } @Override public void request(long n) { if (n <= 0) { subscriber.onError(new IllegalArgumentException("Demand must be positive")); } demand.getAndAdd(n); executor.submit(() -> { try { do { ByteBuffer audioBuffer = getNextEvent(); if (audioBuffer.remaining() > 0) { AudioEvent audioEvent = audioEventFromBuffer(audioBuffer); subscriber.onNext(audioEvent); } else { subscriber.onComplete(); break; } } while (demand.decrementAndGet() > 0); } catch (Exception e) { subscriber.onError(e); } }); } @Override public void cancel() { } private ByteBuffer getNextEvent() { ByteBuffer audioBuffer; byte[] audioBytes = new byte[CHUNK_SIZE_IN_BYTES]; int len = 0; try { len = inputStream.read(audioBytes); if (len <= 0) { audioBuffer = ByteBuffer.allocate(0); } else { audioBuffer = ByteBuffer.wrap(audioBytes, 0, len); } } catch (IOException e) { throw new UncheckedIOException(e); } return audioBuffer; } private AudioEvent audioEventFromBuffer(ByteBuffer bb) { return AudioEvent.builder() .audioChunk(SdkBytes.fromByteBuffer(bb)) .build(); } } }

请参阅 GitHub 上的完整示例

创建客户端和启动流

在 main 方法中,创建请求对象,启动音频输入流,并通过音频输入实例化发布者。

您还必须创建 StartStreamTranscriptionResponseHandler 以指定如何处理来自 Amazon Transcribe 的响应。

然后,使用 TranscribeStreamingAsyncClientstartStreamTranscription 方法启动双向流式处理。

导入

import javax.sound.sampled.AudioFormat; import javax.sound.sampled.AudioSystem; import javax.sound.sampled.DataLine; import javax.sound.sampled.TargetDataLine; import javax.sound.sampled.AudioInputStream; import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider; import software.amazon.awssdk.services.transcribestreaming.TranscribeStreamingAsyncClient; import software.amazon.awssdk.services.transcribestreaming.model.LanguageCode; import software.amazon.awssdk.services.transcribestreaming.model.MediaEncoding; import software.amazon.awssdk.services.transcribestreaming.model.StartStreamTranscriptionRequest; import software.amazon.awssdk.services.transcribestreaming.model.StartStreamTranscriptionResponseHandler; import software.amazon.awssdk.services.transcribestreaming.model.TranscriptEvent;

代码

public static void main(String[] args) throws Exception { TranscribeStreamingAsyncClient client = TranscribeStreamingAsyncClient.builder().credentialsProvider(ProfileCredentialsProvider.create()).build(); StartStreamTranscriptionRequest request = StartStreamTranscriptionRequest.builder() .mediaEncoding(MediaEncoding.PCM) .languageCode(LanguageCode.EN_US) .mediaSampleRateHertz(16_000).build(); TargetDataLine mic = Microphone.get(); mic.start(); AudioStreamPublisher publisher = new AudioStreamPublisher(new AudioInputStream(mic)); StartStreamTranscriptionResponseHandler response = StartStreamTranscriptionResponseHandler.builder().subscriber(e -> { TranscriptEvent event = (TranscriptEvent) e; event.transcript().results().forEach(r -> r.alternatives().forEach(a -> System.out.println(a.transcript()))); }).build(); client.startStreamTranscription(request, publisher, response).join(); }

请参阅 GitHub 上的完整示例

更多信息