7 Comments
Marko Konieczny

Hello Lucas, this is very interesting. Would it be possible for you to share the full code? You also mention two options for getting the real-time transcript. Could you share the second option?

Thanks in advance.

Greetings Marko

Lucas Woodward

Hey Marko,

Thank you, I'm glad you found it interesting!

Unfortunately this post relates to a production system I created, which is still in heavy use. This means I'm unable to share all the code for it. However, if there are specific parts you're interested in understanding I can certainly explain my approach.

The two options for real-time transcripts I discussed were AWS EventBridge (v2.conversations.{id}.transcription topic) and the Notifications API.
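
For anyone wanting a starting point on the Notifications API side, here is a minimal sketch of building a topic subscription request. It is not the production code. It assumes you have already created a notification channel and have its ID, that subscriptions are added by POSTing a list of topic IDs to the channel's subscriptions endpoint, and that your org lives on the host shown. The helper names and the `API_BASE` value are hypothetical, so check them against the Genesys Cloud API reference for your region.

```python
import json

# Assumption: the API host depends on your Genesys Cloud region.
API_BASE = "https://api.mypurecloud.com"


def transcription_topic(conversation_id):
    """Build the transcription topic name for one conversation."""
    return f"v2.conversations.{conversation_id}.transcription"


def subscription_request(channel_id, conversation_ids):
    """Return (url, json_body) for subscribing a channel to transcription topics.

    Assumption: the endpoint accepts a JSON list of {"id": topic} objects.
    Nothing is sent here; pass the result to your HTTP client of choice.
    """
    url = f"{API_BASE}/api/v2/notifications/channels/{channel_id}/subscriptions"
    body = json.dumps([{"id": transcription_topic(c)} for c in conversation_ids])
    return url, body
```

Transcript events would then arrive over the WebSocket associated with that channel, one subscription per conversation you want to follow.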

Marko Konieczny

Hi Lucas

Thank you for your response.

The use case we are currently exploring is to stream a transcript in real time to a third-party application. This third-party application would then determine the intent and, based on that, for example consult a knowledge base and present the outcome in a separate widget within Genesys.

I am aware that Agent Copilot can provide similar functionality. However, it is relatively limited in terms of integration with external knowledge bases. This is why the approach you mentioned, using the Notifications API, seemed to be the most suitable solution for this scenario.

Genesys itself suggested the AudioHook Monitoring option, but this would require streaming the audio first and then converting it, which introduces a higher risk of errors.

It would be greatly appreciated if you could share how you implemented the integration using the Notifications API and which topics you subscribed to.

Thank you in advance.

Greetings Marko

Lakshman

Hello Lucas, very interesting. Thanks for sharing such detailed notes. What would you recommend if I need to derive intent from the real-time transcription and send that intent to the customer's systems to retrieve the relevant knowledge information? Do you think the NLU API within Genesys Cloud helps here?

Lucas Woodward

Thank you!

I do. If I were you, I'd start with the endpoint below to detect intents from utterances in the live transcript:

/api/v2/languageunderstanding/domains/{domainId}/versions/{domainVersionId}/detect
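
As a rough sketch of what a call to that endpoint might look like, the helper below fills in the path and builds a JSON body for one utterance. The body shape (`{"input": {"text": ...}}`) is my assumption about the request schema, and `detect_request` is a hypothetical helper name, so verify both against the API reference before relying on them.

```python
import json


def detect_request(domain_id, domain_version_id, utterance):
    """Return (path, json_body) for an intent detection call.

    Assumption: the endpoint expects the utterance under input.text.
    Nothing is sent here; POST the body to the path with your own client.
    """
    path = (
        f"/api/v2/languageunderstanding/domains/{domain_id}"
        f"/versions/{domain_version_id}/detect"
    )
    body = json.dumps({"input": {"text": utterance}})
    return path, body
```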

I'd then try to work out whether intent should instead be derived from multiple utterances. The live transcript is lower quality than the post-call transcript, so a single utterance may not carry enough context.
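
One simple way to experiment with multi-utterance context is a rolling window that joins the last few utterances before sending them for intent detection. This is only a sketch: the `UtteranceWindow` class and the window size of 3 are hypothetical, and the right size would need to be found by testing against real transcripts.

```python
from collections import deque


class UtteranceWindow:
    """Keep the most recent utterances and expose them as one text block."""

    def __init__(self, max_utterances=3):
        # deque with maxlen drops the oldest utterance automatically.
        self._recent = deque(maxlen=max_utterances)

    def add(self, text):
        """Add an utterance and return the combined text of the window.

        Blank utterances are ignored so they do not push out useful context.
        """
        text = text.strip()
        if text:
            self._recent.append(text)
        return " ".join(self._recent)
```

Each time a new utterance arrives, the value returned by `add` could be sent for detection instead of the single utterance, trading a little latency and noise for more context.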

Depending on the outcome of that I may look at other intent classification models (balancing accuracy with latency).

KpS

Really interesting. Do you mind sharing the full code?

Lucas Woodward

Unfortunately I can't share all the code. Is there a specific part you're interested in?