2017-06-05 38 views
5

Android Bluemix not showing speaker labels

I'm using IBM Bluemix to transcribe audio, and I want to use the speaker recognition (speaker labels) API.

I set up the recognizer like this:

private RecognizeOptions getRecognizeOptions() { 
    return new RecognizeOptions.Builder() 
      .continuous(true) 
      .contentType(ContentType.OPUS.toString()) 
      //.model("en-US") 
      .model("en-US_BroadbandModel") 
      .timestamps(true) 
      .smartFormatting(true) 
      .interimResults(true) 
      .speakerLabels(true) 
      .build(); 
} 

But the returned JSON does not contain the speaker tag. How can I get the speaker tag returned using the Bluemix Java API?

My audio recorder on Android looks like this:

private void recordMessage() { 
    //mic.setEnabled(false); 
    speechService = new SpeechToText(); 
    speechService.setUsernameAndPassword("usr", "pwd"); 
    if(listening != true) { 
     capture = new MicrophoneInputStream(true); 
     new Thread(new Runnable() { 
      @Override public void run() { 
       try { 
        speechService.recognizeUsingWebSocket(capture, getRecognizeOptions(), new MicrophoneRecognizeDelegate()); 
       } catch (Exception e) { 
        showError(e); 
       } 
      } 
     }).start(); 
     Log.v("TAG",getRecognizeOptions().toString()); 
     listening = true; 
     Toast.makeText(MainActivity.this,"Listening....Click to Stop", Toast.LENGTH_LONG).show(); 
    } else { 
     try { 
      capture.close(); 
      listening = false; 
      Toast.makeText(MainActivity.this,"Stopped Listening....Click to Start", Toast.LENGTH_LONG).show(); 
     } catch (Exception e) { 
      e.printStackTrace(); 
     } 
    } 
} 
+0

I think you meant to add the speech-to-text tag, not text-to-speech ;) –

+0

@bear, what audio file and recognition method are you using? Are you using WebSockets? –

+0

@GermanAttanasio I'm using the Watson audio API for Android; see my updated code snippet. – bear

Answer

0

Based on the example, I wrote a sample application and got speaker labels working.

Make sure you are using Java SDK 4.2.1. In your build.gradle, add:

compile 'com.ibm.watson.developer_cloud:java-sdk:4.2.1' 

Here is a code snippet that recognizes a WAV file from the assets folder using WebSockets, interim results, and speaker labels.

RecognizeOptions options = new RecognizeOptions.Builder() 
    .contentType("audio/wav") 
    .model(SpeechModel.EN_US_NARROWBANDMODEL.getName()) 
    .interimResults(true) 
    .speakerLabels(true) 
    .build(); 

SpeechToText service = new SpeechToText(); 
service.setUsernameAndPassword("SPEECH-TO-TEXT-USERNAME", "SPEECH-TO-TEXT-PASSWORD"); 

InputStream audio = loadInputStreamFromAssetFile("speaker_label.wav"); 

service.recognizeUsingWebSocket(audio, options, new BaseRecognizeCallback() { 
    @Override 
    public void onTranscription(SpeechResults speechResults) { 
     Assert.assertNotNull(speechResults); 
     System.out.println(speechResults.getResults().get(0).getAlternatives().get(0).getTranscript()); 
     System.out.println(speechResults.getSpeakerLabels()); 
    } 
}); 

Where loadInputStreamFromAssetFile() is:

// Not static: getAssets() is an instance method of Context (e.g. the enclosing Activity).
public InputStream loadInputStreamFromAssetFile(String fileName) {
    AssetManager assetManager = getAssets(); // From Context
    try {
        return assetManager.open(fileName);
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Returned only when the asset could not be opened.
    return null;
}
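
With interimResults(true), the callback above fires several times, and the speaker labels it prints may still change while "final" is false. A minimal sketch of one way to keep only the latest label for each word start time is below. The classes `SpeakerLabelBuffer` and `Label` are hypothetical holders written for this example, not SDK types; in a real app you would copy the from/to/speaker/final values out of whatever getSpeakerLabels() returns. The null check is an assumption that some callbacks might carry no labels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical helper (not part of the Watson SDK): keeps the newest label per start time. */
final class SpeakerLabelBuffer {

    /** Mirrors the "from", "to", "speaker" and "final" fields shown in the logs. */
    static final class Label {
        final double from;
        final double to;
        final int speaker;
        final boolean isFinal;

        Label(double from, double to, int speaker, boolean isFinal) {
            this.from = from;
            this.to = to;
            this.speaker = speaker;
            this.isFinal = isFinal;
        }
    }

    // Sorted by start time so snapshot() returns labels in spoken order.
    private final TreeMap<Double, Label> byStart = new TreeMap<>();

    /** Stores one batch; a later label with the same start time replaces the earlier one. */
    void update(List<Label> batch) {
        if (batch == null) {
            return; // assumption: a callback may arrive without speaker labels
        }
        for (Label label : batch) {
            byStart.put(label.from, label);
        }
    }

    /** Returns the current labels ordered by start time. */
    List<Label> snapshot() {
        return new ArrayList<>(byStart.values());
    }
}
```

You would call update() from onTranscription() on every callback and read snapshot() once the stream ends.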

Application logs:

I/System.out: so how are you doing these days 
I/System.out: so how are you doing these days things are going very well glad to hear 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay 
I/System.out: so how are you doing these days things are going very well glad to hear I think I mentioned before that there's a company now that I'm working with which is very much just just myself and Chris now you had mentioned that %HESITATION okay 
I/System.out: [{ 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.03, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.34 
I/System.out: }, { 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.34, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.54 
I/System.out: }, { 
I/System.out: "confidence": 0.487, 
I/System.out: "final": false, 
I/System.out: "from": 0.54, 
I/System.out: "speaker": 0, 
I/System.out: "to": 0.63 
I/System.out: }, { 
...... blah blah blah 
I/System.out: }, { 
I/System.out: "confidence": 0.343, 
I/System.out: "final": false, 
I/System.out: "from": 13.39, 
I/System.out: "speaker": 1, 
I/System.out: "to": 13.84 
I/System.out: }]
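
The speaker labels in these logs contain only time ranges and a speaker index, not the words themselves. To get a per-speaker transcript, one option is to also request word timestamps (timestamps(true)) and match each word to the label with the same "from" time. The sketch below shows that matching with plain Java. `SpeakerTurns`, `Word` and `Label` are hypothetical classes for this example, not SDK types, and it assumes the word and label start times line up exactly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the Watson SDK): groups words into speaker turns. */
final class SpeakerTurns {

    /** One recognized word with its start and end time in seconds. */
    static final class Word {
        final String text;
        final double from;
        final double to;

        Word(String text, double from, double to) {
            this.text = text;
            this.from = from;
            this.to = to;
        }
    }

    /** One speaker label entry, mirroring "from", "to" and "speaker" in the logs. */
    static final class Label {
        final double from;
        final double to;
        final int speaker;

        Label(double from, double to, int speaker) {
            this.from = from;
            this.to = to;
            this.speaker = speaker;
        }
    }

    /** Returns lines such as "Speaker 0: so how", one per run of words by the same speaker. */
    static List<String> toTurns(List<Word> words, List<Label> labels) {
        List<String> turns = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentSpeaker = -1;
        for (Word word : words) {
            int speaker = speakerAt(word.from, labels);
            // Flush the running turn when the speaker changes.
            if (speaker != currentSpeaker && current.length() > 0) {
                turns.add("Speaker " + currentSpeaker + ": " + current);
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word.text);
            currentSpeaker = speaker;
        }
        if (current.length() > 0) {
            turns.add("Speaker " + currentSpeaker + ": " + current);
        }
        return turns;
    }

    /** Finds the speaker whose label starts at the given time; -1 means no label matched. */
    static int speakerAt(double from, List<Label> labels) {
        for (Label label : labels) {
            if (Math.abs(label.from - from) < 1e-6) {
                return label.speaker;
            }
        }
        return -1;
    }
}
```

Words with no matching label end up under speaker -1, which makes gaps visible instead of silently assigning them to a neighbor.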