tag:blogger.com,1999:blog-37771252621431507212024-03-12T23:02:34.203-07:00Watson Tips and TricksSome advice on ways to get maximum advantage out of IBM Watson services.Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.comBlogger25125tag:blogger.com,1999:blog-3777125262143150721.post-23637019365925475222023-03-07T23:30:00.001-08:002023-03-07T23:30:00.155-08:00Deleting all documents from a Watson Discovery collection<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz2UcUnyXBaR5S-gaCq3AFBjRIzkutH6vfyYZAue4eveU85nJ0YcYOtYlZAB2ztwaONehhHxNRLIqy4oUicOKMGg-HTOi6ogVtWo3R7iOyQO4CUJthYw_GMHRUcqJPOocrAwBWUJyWV7G39sBXmGQA1fjo9v-GW9YycpbwYAFOiLuBnsvi5inWGQE/s300/wds.jpeg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="168" data-original-width="300" height="168" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz2UcUnyXBaR5S-gaCq3AFBjRIzkutH6vfyYZAue4eveU85nJ0YcYOtYlZAB2ztwaONehhHxNRLIqy4oUicOKMGg-HTOi6ogVtWo3R7iOyQO4CUJthYw_GMHRUcqJPOocrAwBWUJyWV7G39sBXmGQA1fjo9v-GW9YycpbwYAFOiLuBnsvi5inWGQE/s1600/wds.jpeg" width="300" /></a></div><br />Sometimes when you are cleaning up a collection, you might want to delete all of the documents. It is tedious to do this by clicking on the files in the UI. 
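Programmatically, the cleanup amounts to querying the collection for matching document ids and deleting each one. Here is a minimal sketch of that loop in Python (the helper name is mine, and the client is assumed to behave like the <i>DiscoveryV1</i> class in the ibm-watson Python SDK):

```python
# Hypothetical sketch of the deletion loop (delete_matching_documents is my
# own helper name). The client is assumed to offer query() and
# delete_document() in the style of DiscoveryV1 in the ibm-watson Python SDK.
def delete_matching_documents(client, environment_id, collection_id, query=""):
    """Delete every document matched by `query` (all documents by default)."""
    deleted = 0
    while True:
        # Fetch a batch of matching documents; only the ids are needed.
        response = client.query(environment_id, collection_id,
                                query=query, count=50).get_result()
        results = response.get("results", [])
        if not results:
            break
        for doc in results:
            client.delete_document(environment_id, collection_id, doc["id"])
            deleted += 1
    return deleted
```

This is essentially the pattern that the notebook below automates for you.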
Therefore I wrote a simple script <a href="https://www.dropbox.com/s/xyu7tf7m1rkqomv/del_docs.ipynb?dl=0">del_docs.ipynb</a> to do this for you.<p></p><p>To run this script, edit the panel containing the collection details and then just run the notebook.</p><p>By default the script will delete all documents. If you want to delete only a subset of documents, alter the <i>query</i> variable to specify the selection formula for the documents you want to delete.</p>Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-71614157267026263012022-07-04T06:05:00.000-07:002022-07-04T06:05:20.702-07:00Automatically emailing a copy of conversation historyMany chatbots offer the option of emailing the user a copy of the conversation history. Traditionally this has been done by writing server-side code, but with the latest version of the web widget associated with <a href="https://cloud.ibm.com/catalog/services/watson-assistant" target="_blank">Watson Assistant</a> it is really easy to implement this with a small bit of client-side code.<div class="separator" style="clear: both; text-align: right;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHOGz4loNoke51xnhidGlZENo9_4X7Pu2vXkoRoEV1Pu8GnF49HwJNZyb41F9lHz4oUsU_xHf5FSvHs54YnDD8BnG95ji_-LWUWzOyfDBJ72QUkloLytnAnyvQhzROxqD7REi2HKSEmxK9u1Lvft3EW-IHTabEMd9T6WIDQ35y7aH7D2b0Dbuc9NE/s225/watson%20assistant%20logo.png" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="225" data-original-width="225" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHOGz4loNoke51xnhidGlZENo9_4X7Pu2vXkoRoEV1Pu8GnF49HwJNZyb41F9lHz4oUsU_xHf5FSvHs54YnDD8BnG95ji_-LWUWzOyfDBJ72QUkloLytnAnyvQhzROxqD7REi2HKSEmxK9u1Lvft3EW-IHTabEMd9T6WIDQ35y7aH7D2b0Dbuc9NE/w200-h200/watson%20assistant%20logo.png" width="200" /></a></div><div><br /></div>The code to implement this
can be downloaded from <a href="https://github.com/bodonova/chat-history-email" target="_blank">GitHub</a>. What you need to do is:<p></p><ol style="text-align: left;"><li>Place the code contained in the history_template.html file somewhere in the header part of your web site's page. </li><li>Modify the INTEGRATION_ID, REGION_ID and INSTANCE_ID variables to match your Watson Assistant instance.</li><li>Create an account on elasticemail.com if you don't already have one. Then insert the SMTP_HOST, SMTP_USERNAME, SMTP_PASSWORD and FROM_ADDRESS in the JavaScript code.</li></ol><div>The code is fairly easy to understand, but here are a few pointers:</div><div><ul style="text-align: left;"><li>The start and end of the code block is the same as the web widget which is provided for you by the Watson Assistant UI. In fact, accessing this widget is the easiest way to get the correct values for the <i>integrationID:</i>, <i>region:</i>, and <i>serviceInstanceID:</i> variables.</li><li>The code registers a number of <a href="https://web-chat.global.assistant.watson.cloud.ibm.com/docs.html?to=api-events" target="_blank">event handlers</a>:</li><ul><li>The <i>histHandler()</i> function gets called when the chat window is opened. The event passed contains the history which has been stored client-side. This is used to construct the initial email contents and store it in the global variable <i>historyText</i>.</li><li>The <i>msgSent()</i> and <i>msgReceived()</i> functions are called when a message is sent by the end user and by Watson respectively. They update the global variable <i>historyText</i>.</li><li>Last, but not least, an event handler gets called whenever the chat window is closed/minimised; this calls the <i>sendEmail()</i> function described below.</li></ul><li>The <i>sendEmail()</i> function implements the actual sending of the email. 
The documentation for this JavaScript library suggests that any arbitrary SMTP server can be used, but I found that it wouldn't work with any SMTP agent apart from <i>smtp.elasticemail.com</i>. The value of the <i>From:</i> variable indicates where the email will appear to be coming from: this will have to be the email address you used to create the account on Elastic Email. Be careful what address you use, because this is where any replies will be sent.</li></ul><div>If you want to customise the look and feel of the chat history emails, edit the variable <i>baseHistoryText </i>and/or change the <i>watsonMsg()</i> and <i>userMsg()</i> functions.</div></div><div><br /></div><div>I hope you find this sample useful. As always, this code is provided on an as-is basis. No warranty or support is offered.</div><div><br /></div>Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com1tag:blogger.com,1999:blog-3777125262143150721.post-67486347022695836482020-05-21T10:45:00.000-07:002020-05-21T10:45:02.916-07:00Video Tutorials on chatbot MakingI have started making videos explaining the chatbot making process. These are not Watson-specific; they cover the steps needed in general.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe width="320" height="266" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/HLg0x7QZgxc/0.jpg" src="https://www.youtube.com/embed/HLg0x7QZgxc?feature=player_embedded" frameborder="0" allowfullscreen></iframe></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-53856313864295781642020-05-08T08:52:00.000-07:002020-05-08T08:52:45.251-07:00Understanding Intents and EntitiesOne of the most important things that any conversational agent needs to do is to figure out what intent(s) and entities are contained in the user's input. An intent is what the user wants to do and the entities are the things that they want to do it with or to. This can be confusing, so this post tries to explain it with a simple example.<br />
<br />
To illustrate, we will consider a chatbot written for a second-hand electronics store. Samples of expected user input might be something like:<br />
<br />
<ol>
<li>I want to sell my old iPhone</li>
<li>I need a new laptop</li>
<li>I am interested in buying a new iPad</li>
</ol>
<div>
In examples 2 and 3 the user's intent is to buy, while in example 1 the intent is to sell. When designing a conversational agent you define the intents by giving examples of what you expect users with each intent to say. In a real system you would need to give many more examples to give Watson a better chance of guessing the users' intent, but this is enough to illustrate the way it works.</div>
<div>
<br /></div>
<div>
There are different ways to specify entities. The most common one is to give samples of the entity values. For example, if we had an entity @device and the possible values were 'phone', 'tablet' and 'laptop', we would specify the entity as shown below by giving examples of what the user might say for each entity value.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinJuYMCECiX1Rzv2iLTxsLQHapdDxnNiPx4pRp5epLHQAbd3d6wg_Q4rdsvGg39T99vxWb7T9lmB-XitfBTYT5LTIOceriQ_QpV8dXkyQdDGcWjEL2p4Hr2Mr4tR9y4neWQVR7Q3UlVoI/s1600/entites.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="323" data-original-width="1600" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinJuYMCECiX1Rzv2iLTxsLQHapdDxnNiPx4pRp5epLHQAbd3d6wg_Q4rdsvGg39T99vxWb7T9lmB-XitfBTYT5LTIOceriQ_QpV8dXkyQdDGcWjEL2p4Hr2Mr4tR9y4neWQVR7Q3UlVoI/s400/entites.png" width="400" /></a></div>
<div>
<br /></div>
<div>
The second way of specifying an entity is with a regular expression. This is useful in cases where you want to catch an email address or phone number. In these cases it is not feasible to exhaustively list all possible inputs, but the rules for what an email address or phone number looks like are easy to specify in a regular expression.</div>
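For example, here is a sketch of the kind of pattern such an entity might use (a deliberately simplified email regex for illustration; real-world address validation is more involved):

```python
import re

# A deliberately simplified email pattern, for illustration only;
# production-grade email validation is considerably more involved.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def extract_emails(text):
    """Return every substring of `text` that looks like an email address."""
    return EMAIL_PATTERN.findall(text)
```

A pattern-based entity applies a regular expression like this to the user's input and tags each match as an instance of the entity.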
<div>
<br /></div>
<div>
The third option for entities is to use one of the predefined system entities. For example, you might use the @sys-date entity as a simple way to capture mentions of a date without having to go to the trouble of specifying a complex regular expression. This also has advantages: for example, when the user types 'tomorrow', the entity extracted is the correct date for the day after they typed it.</div>
<div>
<br /></div>
<div>
Many developers of chatbots don't realise that you can combine entities with intent examples to make them more powerful. If you specify your examples like this:</div>
<br />
<br />
<ol>
<li>I want to sell my old @device</li>
<li>I need a new @device</li>
<li>I am interested in buying a new @device</li>
</ol>
This saves you the trouble of repeating the same sentence for each type of device that the users might want to buy or sell.<br />
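To make the distinction concrete, here is a toy illustration in Python. This is emphatically not how Watson classifies input (Watson uses machine learning rather than substring matching); it simply shows the division of labour between intents and entities:

```python
# Toy illustration of intents vs. entities, NOT Watson's actual algorithm:
# the intent is what the user wants to do, the entities are the things
# they want to do it to.
INTENT_EXAMPLES = {
    "#buy": ["i need a new", "interested in buying"],
    "#sell": ["i want to sell", "sell my old"],
}
DEVICE_ENTITY = {  # an @device-style entity: value -> sample synonyms
    "phone": ["phone", "iphone"],
    "tablet": ["tablet", "ipad"],
    "laptop": ["laptop", "notebook"],
}

def classify(utterance):
    """Return the matched intents and entity values for an utterance."""
    text = utterance.lower()
    intents = [intent for intent, examples in INTENT_EXAMPLES.items()
               if any(example in text for example in examples)]
    entities = [value for value, synonyms in DEVICE_ENTITY.items()
                if any(synonym in text for synonym in synonyms)]
    return intents, entities
```

Running the three sample inputs from above through this toy classifier yields #sell with @device:phone for the first, and #buy with @device:laptop and @device:tablet for the other two.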
Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-91650301315152756172019-09-11T08:47:00.002-07:002019-09-11T08:47:13.602-07:00Convert Watson csv's to RASA ymlWatson allows you to export the ground truth (questions and intents) as a CSV file, but <a href="https://rasa.com/">RASA</a> expects the data in a slightly different format.<br />Watson's export is in the form<br />
<blockquote class="tr_bq">
Question, Intent<br />Can I buy a sandwich?, #buy_sandwich</blockquote>
<br />
<br />
RASA in the form<br />
<blockquote class="tr_bq">
## intent:buy_sandwich<br />- Can I buy a sandwich?</blockquote>
<br />
RASA calls this markdown (.md) format. The code for converting from the Watson format to the RASA format is below:<br />
<blockquote class="tr_bq">
<pre>import pandas as pd

# Watson exports rows in the form "Question, Intent";
# filePath points at the exported file
questions = pd.read_csv(filePath, names=['Question', 'Intent'])

file = open("rasaoutput.md", "w", encoding="utf-8")
for intent in questions['Intent'].unique():
    # print the intent name in the RASA way (drop Watson's leading '#')
    file.write("## intent:" + intent.lstrip('#') + "\n")
    # then print every question in the dataset with that label
    for question in questions.loc[questions['Intent'] == intent, 'Question']:
        file.write("- " + question + "\n")
    file.write("\n")
file.close()</pre>
</blockquote>
<br />
-David<br />
<div>
<br /></div>
Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-84850230609629583812019-08-12T05:36:00.000-07:002019-08-12T05:36:26.162-07:00Migrating your Watson Assistant workspace to WebHooks<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3xOXp42WqVwfSivk7LJt0MHsakGgeCCB2p-gM3V7_18XrbWw0c4Yriw_gTP8Jxr4JSEMIN1j2nwuxy_iBZ73iCOuVtUxq5vjz_WsTbw2PgBojUqvqrenjKTIoYrKCtSZ9XGxYwBYhiKU/s1600/outgoing-webhook_512.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3xOXp42WqVwfSivk7LJt0MHsakGgeCCB2p-gM3V7_18XrbWw0c4Yriw_gTP8Jxr4JSEMIN1j2nwuxy_iBZ73iCOuVtUxq5vjz_WsTbw2PgBojUqvqrenjKTIoYrKCtSZ9XGxYwBYhiKU/s200/outgoing-webhook_512.png" width="200" /></a></div>
In <a href="https://watson-tricks.blogspot.com/2019/08/what-is-benefit-of-new-webhooks-feature.html">the last post</a> we examined the differences between webhooks and the old way (sometimes called web actions) in which IBM Watson Assistant called REST functions through the use of <a href="https://cloud.ibm.com/docs/openwhisk">IBM Cloud Functions</a>. In this post we will look at a simple example of migrating from one to the other.<br />
<br />
There is a <a href="https://developer.ibm.com/recipes/tutorials/connect-watson-assistant-with-wikipedia-api-via-cloud-functions/">tutorial</a> on DeveloperWorks which guides you through all of the steps to connect your Watson Assistant skill to the Wikipedia API using the old style mechanism. In this blog post we will assume that you have already gone through the steps in the original tutorial and we will describe how you can convert your workspace to use the newly released <a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-dialog-webhooks">webhooks</a> feature.<br />
<br />
The first step in the migration is to make sure that the Cloud Function you created to look up Wikipedia can be called externally.<br />
<br />
<div>
If the function definition is associated with a resource group, the security model will make it hard to call, so you need to ensure that your function definition is associated with a Cloud Foundry space. You can tell whether your selected namespace is IAM based or Cloud Foundry based because the drop-down selector at the top of the page will say (CF-Based) for the latter, e.g.:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-zp1xUgdGKlscx2dmkMTpBr9Lum0ZsUEOr5q1Qt7LlMkm5n7oMnmsam_-SCX3EXBkySP6g0lgaK891lS5f5WbaQf-g16Q8qqfUGTa7G6OstpFeAIWQPw1tfPE4yoUpdTCyrfOd4QTjtk/s1600/cf-namespace-drop-down.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="80" data-original-width="328" height="76" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-zp1xUgdGKlscx2dmkMTpBr9Lum0ZsUEOr5q1Qt7LlMkm5n7oMnmsam_-SCX3EXBkySP6g0lgaK891lS5f5WbaQf-g16Q8qqfUGTa7G6OstpFeAIWQPw1tfPE4yoUpdTCyrfOd4QTjtk/s320/cf-namespace-drop-down.png" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjiICKmt_IK-pUQOu1k4LHVEBqssadfPyJ-1-53sv4HBfpm9cuAM7hCTmUJs0gOsa98za9Zhg3WjU4YVZbQCdwHDmJ1zPp1Ynej7TG8wXAYaAPf3_4B24ftlAyCOY4g_cV5Iqo04lqM0c/s1600/iam-namespace-drop-down.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="80" data-original-width="328" height="76" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjiICKmt_IK-pUQOu1k4LHVEBqssadfPyJ-1-53sv4HBfpm9cuAM7hCTmUJs0gOsa98za9Zhg3WjU4YVZbQCdwHDmJ1zPp1Ynej7TG8wXAYaAPf3_4B24ftlAyCOY4g_cV5Iqo04lqM0c/s320/iam-namespace-drop-down.png" width="320" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
If your function is defined in an IAM resource group, the easiest way to move it is to create a new function (with the exact same code) in a Cloud Foundry space, i.e. switch to the new space with the drop-down and then create the function as described in the original tutorial.</div>
<div>
<br /></div>
<div>
After you have saved the Action, you should try it out by changing the <i>object_of_interest</i> to different things you might be interested in looking up and then seeing what the function returns.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcNWpR7Y6_7Qlf7Yojfl-pfdjfvHP58RmgafLz4uZajV2fyHZxG0D6EkwVihZYMhyphenhyphen_zModyxalZdmKyCoy2UZmy9ZMISYdxr1-LRLZfvxSzYqT3TUb8c_M55wtvOXf3Wpu6PlSKUkJrZQ/s1600/invoke-button.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="297" data-original-width="487" height="195" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcNWpR7Y6_7Qlf7Yojfl-pfdjfvHP58RmgafLz4uZajV2fyHZxG0D6EkwVihZYMhyphenhyphen_zModyxalZdmKyCoy2UZmy9ZMISYdxr1-LRLZfvxSzYqT3TUb8c_M55wtvOXf3Wpu6PlSKUkJrZQ/s320/invoke-button.png" width="320" /></a></div>
<div>
<br />
Since webhooks don't interact directly with Cloud Functions as such, you will need to ensure that your action is turned into a Web Action which can be called by any REST client. To do this, click on the <i>Endpoints</i> link in the left margin. This will give you the option to make your action invokable via a REST URL.</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgJj6YXlpWdc4EowLANXhs1IYpUK1-r82M6WQdN9i_WtZZa9kHBNQbV_vjIZGklWqgxVlvC-d3fjZx0J-6pFX8JS8QerKaSkOHbVC80sXoONYrxriKX0s_wP1oXhKQYiPQws2MCKNtO2g/s1600/Screenshot+2019-08-08+at+11.31.12.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="156" data-original-width="1121" height="89" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgJj6YXlpWdc4EowLANXhs1IYpUK1-r82M6WQdN9i_WtZZa9kHBNQbV_vjIZGklWqgxVlvC-d3fjZx0J-6pFX8JS8QerKaSkOHbVC80sXoONYrxriKX0s_wP1oXhKQYiPQws2MCKNtO2g/s640/Screenshot+2019-08-08+at+11.31.12.png" width="640" /></a></div>
<div>
<br />
Once you do this, you will see a curl command which can be used to invoke the web action. Initially the screen only shows <i>API-KEY</i> rather than the actual API key assigned to you. Click on the eye icon on the right to display the fully correct curl command.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNv5PkeF7cPsZKNL6d7mYvZX7D4Veb7s0xUQ4YAyaDAIVs9tMCeocpUyMYQPul5msAAsXm2Q-OI2nFrjDetBqzeJs6Mo0Somhw6hyeF9Q4t7Sxe9c3UKz3eOIIdICQYBO_e2AQGTPTc7k/s1600/curl-with-apikey.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="125" data-original-width="1195" height="66" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNv5PkeF7cPsZKNL6d7mYvZX7D4Veb7s0xUQ4YAyaDAIVs9tMCeocpUyMYQPul5msAAsXm2Q-OI2nFrjDetBqzeJs6Mo0Somhw6hyeF9Q4t7Sxe9c3UKz3eOIIdICQYBO_e2AQGTPTc7k/s640/curl-with-apikey.png" width="640" /></a></div>
<div>
<span style="color: #0000ee;"><span style="color: black;"><br /></span></span>If you have curl installed you can copy this command and execute it in your command-line window. However, you will get an error because you haven't given any input parameters. To solve this, add more command-line options to specify the object of interest and the fact that the data you are supplying is in JSON format, e.g.:<br />
<blockquote class="tr_bq">
<span style="color: blue; font-family: "courier new" , "courier" , monospace; font-variant-ligatures: no-common-ligatures;">curl -u 3a686c56-12fc-4bd9-8a08-55317fec468d:CE4A88p1qGYV69dF43iVENNn3Ok6DHdtlYz7tlrCh0yG7aNRvUgzcHHBNJxi15z9 --header "Content-Type: application/json"</span><span class="Apple-converted-space" style="color: blue; font-family: "courier new" , "courier" , monospace; font-variant-ligatures: no-common-ligatures;"> </span><span style="color: blue; font-family: "courier new" , "courier" , monospace; font-variant-ligatures: no-common-ligatures;">--data "{\"object_of_interest\": \"love\"}" -X POST https://us-east.functions.cloud.ibm.com/api/v1/namespaces/brian_odonovan_bod-space/actions/Assistant-Functions/Wikipedia-Lookup?blocking=true</span></blockquote>
</div>
<div>
If you prefer using another tool like <a href="https://www.getpostman.com/">POSTMAN</a>, you can easily convert this command to suit. The one thing you need to be aware of is that the authorisation you supply with the -u parameter to the curl command consists of two parts: the part before the colon is effectively a username and the part after the colon is the password.</div>
<div>
<br /></div>
<div>
Once you have verified that your Web Action is working correctly, you next need to change the Watson Assistant skill to use webhooks when calling Wikipedia. You do this by clicking on the Options tab when editing the dialog, selecting the <i>Webhooks</i> option on the left (it should be selected by default) and then entering details of the URL you want to call, what credentials to use and any other headers you want to pass to the function.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvTZAYlnf0vhRjjMH3gQTUVheBiwpz7zUzANYpJDRt8M8xZbG4BYmBy_2Zy-0gfb5pQFHEJtaFW_98IceNqw1e33zCgHqyWSVLAmkHaTleMLhdRaOe2AKkk7F6TG2sg0WxIBGg3jYSu_0/s1600/webhook-options.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="769" data-original-width="1021" height="482" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvTZAYlnf0vhRjjMH3gQTUVheBiwpz7zUzANYpJDRt8M8xZbG4BYmBy_2Zy-0gfb5pQFHEJtaFW_98IceNqw1e33zCgHqyWSVLAmkHaTleMLhdRaOe2AKkk7F6TG2sg0WxIBGg3jYSu_0/s640/webhook-options.png" width="640" /></a><br />
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<br />
You can use the URL and credentials from the curl command that you got from the web actions page described above. You might be slightly worried that there is a single URL assigned to a skill, because in some cases you might need to access services from different sites. There are ways of getting around this limitation, but I won't describe them here since our use case doesn't need to connect to multiple services. The next blog post in this series will describe in detail how you can connect to multiple REST services from a single WA workspace.</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
The dialog node which implements the interface to Wikipedia through webhooks will be significantly different from the old one, so I suggest that you either delete or disable the old node. For example, you could rename the node to <i>Old Wikipedia</i> and disable it by changing the match condition to <i>false</i>, as illustrated below.</div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFtN1vTF5OxL8GRS_iHmrvDsri-QmFhoYz2xGUZ_aor22haT2JhuaiuX8pgfElMTKL1VesV0BVat7si_OXdngXmHZiCSKeuL5IwlA5J-XLjA3oUU8r9Pz4ssphV58fCLkbyLP_TZMplHM/s1600/disable-dialog-node.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em; text-align: left;"><img border="0" data-original-height="228" data-original-width="232" height="196" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFtN1vTF5OxL8GRS_iHmrvDsri-QmFhoYz2xGUZ_aor22haT2JhuaiuX8pgfElMTKL1VesV0BVat7si_OXdngXmHZiCSKeuL5IwlA5J-XLjA3oUU8r9Pz4ssphV58fCLkbyLP_TZMplHM/s200/disable-dialog-node.png" width="200" /></a></div>
<div style="text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Now you have to create a new dialog node for calling the webhook. You should call the node <i>Wikipedia</i> or something similar and make sure it is activated whenever the <i>#tell_me_about</i> intent is detected. Next, click on the 'customize' icon; this will give you an option to turn on webhooks for this node.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9gdt6C8kiqCZdA8qpAMyafr-vjdmVacGOZj3LKnyFjYU8edfHmE7IPC-tUITjjucJqtxVWOmvf4hUhY79JfjEasGnHAWaKKH5JRU96SZyoY1M9eHuOY5o1Gn8MGSaULKvwslCQq9gDm4/s1600/webhooh-enable-dialog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="443" data-original-width="681" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9gdt6C8kiqCZdA8qpAMyafr-vjdmVacGOZj3LKnyFjYU8edfHmE7IPC-tUITjjucJqtxVWOmvf4hUhY79JfjEasGnHAWaKKH5JRU96SZyoY1M9eHuOY5o1Gn8MGSaULKvwslCQq9gDm4/s320/webhooh-enable-dialog.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
As soon as you close the customize dialog, you will see additional UI elements where you can specify which parameters to pass to the REST call and which context variable to use to store the response.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5ri67UA46Segi0fy4xQi4VpMoiM5qBSs9uV_uWl9VsR0dC8BNC4Mm1A7D-Pgs8aeQhUfDa52LapXlTZcgg8am0ezGZHIDOZPUWcJF4iH24r-3VcGqHG04qtNv0ZMIF1klo_RQyrhq9lQ/s1600/webhook-parameters.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="395" data-original-width="755" height="207" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5ri67UA46Segi0fy4xQi4VpMoiM5qBSs9uV_uWl9VsR0dC8BNC4Mm1A7D-Pgs8aeQhUfDa52LapXlTZcgg8am0ezGZHIDOZPUWcJF4iH24r-3VcGqHG04qtNv0ZMIF1klo_RQyrhq9lQ/s400/webhook-parameters.png" width="400" /></a></div>
<br />
You will also see that the node has been converted into a multi-condition response node with two pre-configured output slots: what to say when your REST call succeeds (i.e. when the context variable has been set) and what to say when the variable wasn't set (which probably indicates a network error or something similar).<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlUUvn5aoiZN7Fp3OvkWll0eAP9YPAx40T7NSZ1Vs2WEdSPYFm_J0gVk_B2TH81_s9j3vd4e-wrdO49IMlA_DVCK3xYsMR6J5Lka_aUhMtZwN3jKIfIC0XjwP-tZt4eZ9v_s5CCJXqwAg/s1600/webhook-response.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="361" data-original-width="748" height="192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlUUvn5aoiZN7Fp3OvkWll0eAP9YPAx40T7NSZ1Vs2WEdSPYFm_J0gVk_B2TH81_s9j3vd4e-wrdO49IMlA_DVCK3xYsMR6J5Lka_aUhMtZwN3jKIfIC0XjwP-tZt4eZ9v_s5CCJXqwAg/s400/webhook-response.png" width="400" /></a></div>
<br />
<br />
You can use the same responses as in the original tutorial since the format of the response won't have changed. You can now test your application and see that it behaves more or less as before.<br />
<br />
There are two things that you should note about the way that Watson Assistant Webhooks work:<br />
<br />
<ol>
<li>We specified that you add a parameter named <i>object_of_interest</i> and set its value to the contents of the <i>@object_of_interest</i> entity (let's assume that you asked "what is love", so the value will be "love"). <br /><br />Normally when people say that they are adding parameters to a POST call they mean that they are adding a header with the value "<i>object_of_interest: love</i>". However, this is not what Watson Assistant does. Instead it sends a JSON body with each of the parameter values, e.g. <span style="color: blue;">{"object_of_interest": "love"}</span>.<br /><br />This is actually a better thing to do, but make sure you don't get confused by the terminology in the documentation.</li>
<li>Watson Assistant tells you that it stores the response from the REST call in the context variable you specify, but this is not exactly what it does. While the webhooks functionality is not totally tied to Cloud Functions, it does make certain assumptions based upon the way Cloud Functions operate.<br /><br />The response from a call to a Cloud Function will contain lots of information about the call other than just the response from the REST service called. It looks something like: </li>
</ol>
<blockquote class="tr_bq">
<span style="color: blue; font-family: "courier new" , "courier" , monospace;">{<br />&nbsp;&nbsp;"activationId": "xxx",<br />&nbsp;&nbsp;"end": &lt;time_stamp&gt;,<br />&nbsp;&nbsp;"start": &lt;time_stamp&gt;,<br />&nbsp;&nbsp;"response": {<br />&nbsp;&nbsp;&nbsp;&nbsp;"result": { ... },<br />&nbsp;&nbsp;&nbsp;&nbsp;"status": "success",<br />&nbsp;&nbsp;&nbsp;&nbsp;"success": true<br />&nbsp;&nbsp;},<br />&nbsp;&nbsp;...<br />}</span></blockquote>
<br />
When your IBM Cloud Function returns, the data returned by the REST service is contained in the <i>response.result</i> part of the JSON structure returned. If you are not using Cloud Functions, make sure you follow this convention because Watson Assistant will be expecting it. Similarly, you should also set the <i>response.success</i> variable to the value <i>true</i>, because otherwise Watson Assistant will assume that the call has failed.
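If you implement your webhook outside Cloud Functions, you can reproduce this envelope yourself. A minimal sketch in Python (the helper name is mine, not part of any Watson SDK):

```python
def wrap_webhook_response(result, success=True):
    """Wrap a REST result in the Cloud-Functions-style envelope that
    Watson Assistant expects (only the fields described in this post)."""
    return {
        "response": {
            # the payload that ends up in your chosen context variable
            "result": result,
            "status": "success" if success else "error",
            # Watson Assistant treats anything other than true as a failed call
            "success": success,
        }
    }
```

Whatever framework serves your endpoint, return this structure as the JSON body of the response.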
<br />
</div>
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000}
span.s1 {font-variant-ligatures: no-common-ligatures}
</style>Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-75242200853309223932019-08-08T02:42:00.000-07:002019-08-08T02:42:00.454-07:00What is the benefit of the new webhooks feature in Watson Assistant<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjK8HCrUwBQUuaugDUIE3Wh4H6-Bx5dfT6yCeG7qOguwPWWDFGP5j0AVb0rpF5T0piydcpNIrTUZw0pbprxCMBCvRWM6Ur4S6SgKxtjzi1kpYNeTJFLDNX9yv0ac1xBldpXYpmzWGQBf0c/s1600/icons8-rest-api-512.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="500" data-original-width="500" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjK8HCrUwBQUuaugDUIE3Wh4H6-Bx5dfT6yCeG7qOguwPWWDFGP5j0AVb0rpF5T0piydcpNIrTUZw0pbprxCMBCvRWM6Ur4S6SgKxtjzi1kpYNeTJFLDNX9yv0ac1xBldpXYpmzWGQBf0c/s200/icons8-rest-api-512.png" width="200" /></a></div>
When building an AI chatbot it is impossible to incorporate all knowledge directly in your skill. As a result developers often find themselves wanting to call external functions to answer certain queries. IBM has responded to this requirement by supporting the <a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-dialog-actions">calling of cloud functions from within a Dialog node</a> in Watson Assistant.<br />
<br />
While developers have found this useful, they have also complained that it is inflexible and not particularly easy to use. To address these complaints, IBM has recently released a new feature called <a href="https://cloud.ibm.com/docs/services/assistant?topic=assistant-dialog-webhooks">webhooks</a>. This feature was available in a limited beta for several months, but it has just become generally available, so now is a good time to look at it.<br />
<br />
This table summarises the differences between the two mechanisms:<br />
<br />
<table border="1" style="width: 100%;">
<tbody>
<tr>
<th>Aspect</th>
<th>webhooks</th>
<th>old way</th>
</tr>
<tr>
<td width="70px">URL Flexibility</td>
<td>With webhooks you can call any arbitrary URL. This means that you are not tied to using IBM Cloud Functions as an intermediate layer. Of course, Watson Assistant will always make a POST call to your URL and supply the parameters in JSON. If the REST API you want to call does not accept this, you will need some mechanism to transform the call. However, you are free to choose any transformation tool that you want.</td>
<td>With the old mechanism, you could only call a Cloud Function defined in the same environment as the Watson Assistant instance doing the calling. This was quite restrictive: although it worked OK with Cloud Foundry based authentication, it was not really compatible with the newer IAM-style resource group authentication currently used in IBM Cloud.</td>
</tr>
<tr>
<td>UI Assistance</td>
<td>There is a UI to guide you in defining the authorisation and other parameters for your call to a webhook. This makes it quite user friendly.</td>
<td>The way you specified that a REST call should be made was by editing the JSON response from a node to include an action parameter. Apart from the documentation, there was no assistance to help the developer define this correctly.</td>
</tr>
</tbody>
</table>
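As noted in the table, Watson Assistant always POSTs JSON to a webhook; if the downstream REST API expects something else, a thin relay can reshape the call. The sketch below, with an invented base URL and parameter names, shows one such transformation: turning the JSON POST body into a GET query string.

```python
import json
from urllib.parse import urlencode

def to_get_url(base_url, post_body):
    """Rewrite a webhook's JSON POST body as a GET URL with query params."""
    params = json.loads(post_body)
    # Only top-level scalar values make sense in a query string.
    flat = {k: v for k, v in params.items() if isinstance(v, (str, int, float))}
    return base_url + "?" + urlencode(flat)

# Invented example: a weather API that only accepts GET requests.
url = to_get_url("https://example.com/api", '{"city": "Dublin", "days": 3}')
print(url)   # https://example.com/api?city=Dublin&days=3
```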
<br />
<br />
Now that we have compared the two mechanisms, our next blog post will look at a simple example of migrating from one to the other.<br />
<br />
<br />Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-77800765324174041822019-07-03T04:03:00.000-07:002019-07-03T04:03:00.394-07:00Watson is starting to sound much more natural<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0cRA0jznxluP1BLU-xIPP6GCEeNbDDj6hQ9SKggbVDpexGcyEsdvY32VRnayd2j10EIQpVtu4I2CKUMSP0YTrSciX8Imp0_a0ElrMdlQNXhThkhu7dcRDNHrstdOtpCzCnVZTzquahPk/s1600/tts-logo.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="698" data-original-width="697" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0cRA0jznxluP1BLU-xIPP6GCEeNbDDj6hQ9SKggbVDpexGcyEsdvY32VRnayd2j10EIQpVtu4I2CKUMSP0YTrSciX8Imp0_a0ElrMdlQNXhThkhu7dcRDNHrstdOtpCzCnVZTzquahPk/s200/tts-logo.png" width="199" /></a></div>
IBM has recently implemented a very significant change in the technology that they use for speech synthesis.<br />
<br />
To simplify, the traditional technology involved splitting the training audio into chunks of roughly half a <a href="https://en.wikipedia.org/wiki/Phoneme">phoneme</a>; when given a snippet of speech to synthesise, it picks the most suitable chunks to use and combines them. Sometimes it is lucky and finds a large part of the desired speech already in the training corpus, and in this case it can generate a very realistic output (because it is essentially replaying a recorded sample). More often, however, Watson needs to combine chunks from different utterances in the training data. While there are techniques to try and seamlessly fuse the different chunks together, users frequently complain that they can hear a <i>choppiness</i> and the voice sounds more robotic than human.<br />
<br />
The newly released technology generates the synthesised speech from scratch rather than leveraging recorded chunks of speech. It makes use of three different Deep Neural Networks (DNNs) that look after prosody, acoustic features and voice signal creation. The result is a much more natural sounding voice. Another advantage is that it is much easier to adapt the engine to a new voice, because the amount of recorded speech required from the voice actor is much smaller (since we don't need a large corpus to pick samples from).<br />
<br />
You can read an academic description of the research <a href="https://arxiv.org/pdf/1905.00590.pdf">here</a> and a more end user based description <a href="https://medium.com/ibm-watson/ibm-watson-text-to-speech-neural-voices-added-to-service-e562106ff9c7">here</a>.<br />
<br />
Most users agree that this new technology sounds much better. You can try it out for yourself <a href="https://text-to-speech-demo.ng.bluemix.net/">here</a> at the normal Watson TTS demo page. When you select a voice to use, the ones with this new technology are identified by having 'dnn technology' written after the voice name. I am sure that you will agree that these sound better than the traditional voices (which are still available).Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-85973247673975814422019-02-12T07:30:00.000-08:002019-02-12T07:30:48.020-08:00Matching Only on the Number of Digits you WantFrequently in Watson you will have two entities that both involve numbers. Say a birth month is two digits long and a birth year four. A problem can arise where the shorter pattern matches inside the longer number, because the four-digit year contains two-digit substrings that look like a month.
<p>
Month Entity
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBYRYMuRkQcM_tOZMux2szmyU_WGu0UYvtZupT3XgwqS2vB_l3KNug4vwVMbRSXQsSyqELjJ1fTBTV0egFvlFMXTau1dwniQJ8tM5GsXukeRTgcVTXF1OEl1P7oCqdix_qQ4tPKestAAw/s1600/Month.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBYRYMuRkQcM_tOZMux2szmyU_WGu0UYvtZupT3XgwqS2vB_l3KNug4vwVMbRSXQsSyqELjJ1fTBTV0egFvlFMXTau1dwniQJ8tM5GsXukeRTgcVTXF1OEl1P7oCqdix_qQ4tPKestAAw/s320/Month.png" width="320" height="101" data-original-width="786" data-original-height="247" /></a>
<p>
Year Entity
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8VOFGVeXdDiL2q9ndDmMCeNh-0uJV-hmyfTyEgPkH7fSQK8u-J-io35tci7IYqCS0Ck5Fo27W_qDusF7KzhSk6D2wDOre8DLe-p6yJ85uvsGJ8C8-1ahDaiEta3TQ3UQf_5VTAGzCgos/s1600/year.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8VOFGVeXdDiL2q9ndDmMCeNh-0uJV-hmyfTyEgPkH7fSQK8u-J-io35tci7IYqCS0Ck5Fo27W_qDusF7KzhSk6D2wDOre8DLe-p6yJ85uvsGJ8C8-1ahDaiEta3TQ3UQf_5VTAGzCgos/s320/year.png" width="320" height="38" data-original-width="863" data-original-height="103" /></a>
<p>
But when a year is given, the month entity is also found.
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhtLo1VSfgaBzSLhHGP_TIfSrWSasBEup8KNQKGncOQSQhppmMoIYgfmrnvw6Zh6vwul6T81Yv9aJhI3NL_9Mq3ccVIjwwuLZP6Eos88Y1Pu4UxwsIypn-C45yYdl12Mzue4Ay9n7MzQM/s1600/catch.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhtLo1VSfgaBzSLhHGP_TIfSrWSasBEup8KNQKGncOQSQhppmMoIYgfmrnvw6Zh6vwul6T81Yv9aJhI3NL_9Mq3ccVIjwwuLZP6Eos88Y1Pu4UxwsIypn-C45yYdl12Mzue4Ay9n7MzQM/s320/catch.png" width="320" height="220" data-original-width="446" data-original-height="307" /></a>
<p>
<p>
A way around this is to use the \b word-boundary regular expression marker, which says that we only want the digits when they are surrounded by spaces, words or the ends of the text rather than by more digits.
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiD8nyiXEO2ITCu6XRAfODil57AvWroN_HZxq98ZcCAxObagy4Pi3OQ99UIQU1yBtAgfHAeJzK05rUbFg-zRdRP8AcIOYDtg1uqAT8cYI3H5hxDUekP0SNPF9Yc0Tto-UfsB674uWrUYMY/s1600/month2.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiD8nyiXEO2ITCu6XRAfODil57AvWroN_HZxq98ZcCAxObagy4Pi3OQ99UIQU1yBtAgfHAeJzK05rUbFg-zRdRP8AcIOYDtg1uqAT8cYI3H5hxDUekP0SNPF9Yc0Tto-UfsB674uWrUYMY/s320/month2.png" width="320" height="38" data-original-width="836" data-original-height="100" /></a>
<p>
and now it works. The \b markers are not included in the captured entity; only the number itself is captured, which is handy.
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlj37ddEZjen4ZoGGkbMk_KWnjAcS-Kt1wOrU6DdJOyfOJRkslSOhdEYxIusIA-MaluJOhIRBm6Q_Ld7-n-0fyj5AEF8TjorWmKpO84BSYqAlzXPVgv5hx1gaoFYx1NvwaEE9DGcdFcxI/s1600/nowWorks.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlj37ddEZjen4ZoGGkbMk_KWnjAcS-Kt1wOrU6DdJOyfOJRkslSOhdEYxIusIA-MaluJOhIRBm6Q_Ld7-n-0fyj5AEF8TjorWmKpO84BSYqAlzXPVgv5hx1gaoFYx1NvwaEE9DGcdFcxI/s320/nowWorks.png" width="320" height="178" data-original-width="431" data-original-height="240" /></a>
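The same behaviour can be checked outside Watson with any regular-expression engine. This minimal Python sketch (the patterns mirror the entity definitions above; the sample sentences are invented) shows that a bare two-digit pattern matches inside a four-digit year, while the \b-anchored version does not, and that the matched text contains only the digits:

```python
import re

year_text = "I was born in 1969"

# Without word boundaries the two-digit month pattern matches inside "1969".
bare_month = re.compile(r"[0-9]{2}")
print(bare_month.findall(year_text))        # finds "19" and "69"

# With \b on both sides, no match can be embedded inside the longer number.
anchored_month = re.compile(r"\b[0-9]{2}\b")
print(anchored_month.findall(year_text))    # finds nothing

# The \b markers are zero-width: only the digits end up in the match.
match = anchored_month.search("My birth month is 07.")
print(match.group(0))                       # "07"
```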
<p>
I have tested this in Chinese, where spaces are not used between words, and it also works, which is great.Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-20434544184880869742018-11-28T09:53:00.000-08:002018-11-28T09:53:16.818-08:00Matching patterns and getting their values in Watson Assistant/Conversation<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwIVeJMKw0Clh5_mIwe76JsYyJnkMGTLTViMvbsB658hguhrKR9lGuKbPfVMv2v6GHLSX-NUvTFTiZeksYeO3UtSvfnJv4BYwj8ynYHZDusLf0FM948x5Jmjp7_KGbXKlRxcBSsp1MmI8/s1600/watson-assitant.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="354" data-original-width="354" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwIVeJMKw0Clh5_mIwe76JsYyJnkMGTLTViMvbsB658hguhrKR9lGuKbPfVMv2v6GHLSX-NUvTFTiZeksYeO3UtSvfnJv4BYwj8ynYHZDusLf0FM948x5Jmjp7_KGbXKlRxcBSsp1MmI8/s200/watson-assitant.png" width="200" /></a></div>
When <a href="https://www.ibm.com/watson/ai-assistant/">IBM Watson Assistant</a> (formerly known as Watson Conversation) is deciding how to respond to a user's utterance it is vital that it correctly identifies the intent (what the user wants to do) and the entities (what are the things involved in the intent). For example, if the user says "I want to buy a book" - the intent would be <i>#MakePurchase</i> and the entity<i> @ItemOfInterest</i> would have a value of "book".<br />
<br />
In earlier releases of Watson Assistant, the only way to specify possible entity values was either by manually listing the possible values or by selecting one of the predefined system entities such as <i>@sys-date</i>. Sometimes this works quite well, but at other times (e.g. when you are expecting an email address or an account number) it is not feasible to list all of the possible values that people might enter.<br />
<br />
Luckily, the latest version of the Watson Assistant service allows you to <a href="https://console.bluemix.net/docs/services/conversation/entities.html">specify allowable entity values with a regular expression</a>. Unfortunately, people sometimes find it hard to retrieve the matched value from a pattern match. If you are not careful, you will only be told that an email address was specified, not which exact email address was given. Therefore this blog post works through a very simple conversational design to explain what you need to do.<br />
<br />
First off, you define an intent. We will call our intent <i>#sendMessage</i> and we give Watson a few examples of what the user might say when they want to send a message.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho5KuyWOAgDCD3xdi8ZI0rBlTOzwcGMdKXaQgt6N4u1jsK_dxtdMouXF3Mh3bSmIJQxbMHigqyeSgDbFfxKZ8SPif0rqtb3cIdtXpSQ_XNwgp-Yht4RRbR3YHEXr7fScLTkf8kZ804qVE/s1600/sendMessageIntent.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="591" data-original-width="407" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEho5KuyWOAgDCD3xdi8ZI0rBlTOzwcGMdKXaQgt6N4u1jsK_dxtdMouXF3Mh3bSmIJQxbMHigqyeSgDbFfxKZ8SPif0rqtb3cIdtXpSQ_XNwgp-Yht4RRbR3YHEXr7fScLTkf8kZ804qVE/s400/sendMessageIntent.png" width="275" /></a></div>
<br />
Then we create a <i>@contact_info</i> entity which we expect users to specify when they are sending a message. To exercise this entity, the user types a message indicating that they want to send a message; we expect that the message will also contain details of where to send it, either an email address or a phone number (the phone number can be specified <a href="https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s02.html">in US style</a> or in the <a href="https://en.wikipedia.org/wiki/E.164">e164 standard</a> common in other parts of the world).<br />
<br />
This picture shows how the entity definition will look. Don't worry if you can't read the regular expressions in the screenshots, you can <a href="https://ibm.ent.box.com/s/ceil7c1q46le1kmqvo2pt52p8wv60mll">download the workspace design</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg_jR1ktrJSYMTZsNw9JgfAsvhvjo7g3qgAKvtgP4nQJUman1gTs9alfSn5ODo9bByFzqzIyh86GHjkULmf5Lc9106cDMV9OM90i_6yGHYScStk3i1oKJvIXYjNGlgQehN4qkO0GDXMic/s1600/contact_info_entity.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="546" data-original-width="892" height="242" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhg_jR1ktrJSYMTZsNw9JgfAsvhvjo7g3qgAKvtgP4nQJUman1gTs9alfSn5ODo9bByFzqzIyh86GHjkULmf5Lc9106cDMV9OM90i_6yGHYScStk3i1oKJvIXYjNGlgQehN4qkO0GDXMic/s400/contact_info_entity.png" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<br />
Now you need to insert a dialog node to handle requests to send messages. We create a node in our dialog flow which is triggered when Watson detects that the user's intention is to send a message. We know that it is necessary to have contact information to send a message, so if the user didn't supply this we will prompt them.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWae3md8WTMINvxK8cPWBAEiQLdUPN4Z6AsR4pWaz1x3ytwfq2PQS_ze1_1jlgMpWBjkQpo-xG69hV2imIPtC4DYw6QPtl8BJd1XW4p9u2RpW66zz21QxsccUXW3MDi9ZXnhcoyNwp4nI/s1600/send-message-dialog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1272" data-original-width="1564" height="325" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWae3md8WTMINvxK8cPWBAEiQLdUPN4Z6AsR4pWaz1x3ytwfq2PQS_ze1_1jlgMpWBjkQpo-xG69hV2imIPtC4DYw6QPtl8BJd1XW4p9u2RpW66zz21QxsccUXW3MDi9ZXnhcoyNwp4nI/s400/send-message-dialog.png" width="400" /></a></div>
<br />
<br />
Then we need sub-nodes which deal with sending either emails or phone messages. We select which to activate depending on the value of the <i>@contact_info</i> variable, which will be either <i>email</i>, <i>us_phone_num</i> or <i>e164_phone_num</i>.<br />
<br />
When sending a message, it is not enough to know that the user gave us an email address - we need to know the exact email address given. To do that, we need to define a variable whose value will be specified as <span style="color: blue;"><span style="color: blue;">"</span><i style="color: blue;"><? @contact_info.literal ?>"</i>. The screen shot below shows the dialog node for sending a phone message.</span><br />
<div>
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5lT8Vd8sl6iw55jvZUSNtCWhWHiPjvKWFgI45HdCAJEdVPkS0OiYv14XpjaLqPKEJllfEbG7WV-hb2wRZ2knRO2cxMH5TThTsExeGT5e9DA0eyXJjCW6nqUnukSpqpuXLWqeGPJoMFaY/s1600/send-text.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1192" data-original-width="1492" height="255" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5lT8Vd8sl6iw55jvZUSNtCWhWHiPjvKWFgI45HdCAJEdVPkS0OiYv14XpjaLqPKEJllfEbG7WV-hb2wRZ2knRO2cxMH5TThTsExeGT5e9DA0eyXJjCW6nqUnukSpqpuXLWqeGPJoMFaY/s320/send-text.png" width="320" /></a></div>
<br />
<br />
This is the end of our very simple BOT. If you want to see this in action, download the design file <a href="https://ibm.box.com/s/ceil7c1q46le1kmqvo2pt52p8wv60mll">here</a> and import it into your own Watson Assistant instance. Here is a screenshot of what I saw when I clicked on the <i>"Try it out"</i> button to see the bot in action.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqJwdSj0YbHd-YqJ4ko_b3oOI1GI7pev-4jWqYieDC1WJ7CzhBA8OG1AXbACzViXZIsMcKrYD2IL7ZPVOI4_Chj-c6SlDHjF4-s9WV7nUhv17gO2cMq3Pfvcr1gGxHWGxJBUg3eroGucw/s1600/sendMessageTry.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="513" data-original-width="464" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqJwdSj0YbHd-YqJ4ko_b3oOI1GI7pev-4jWqYieDC1WJ7CzhBA8OG1AXbACzViXZIsMcKrYD2IL7ZPVOI4_Chj-c6SlDHjF4-s9WV7nUhv17gO2cMq3Pfvcr1gGxHWGxJBUg3eroGucw/s320/sendMessageTry.png" width="289" /></a></div>
<br />
In summary, regular expression entities can be really useful, so long as you remember to use the @entity_name.literal syntax to get the actual content that was matched rather than simply which rule was fired.<br />
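The same idea applies if you call the Assistant API from your own code: the message response names the entity that matched, and the matched text can be recovered from the input using the entity's character location. The sketch below uses a mock response whose values are invented for illustration; the field names (entity, value, location) follow the documented response shape.

```python
# A mock of the JSON an Assistant client receives; the shape follows the
# documented message response, but the values here are invented.
input_text = "Send a message to bob@example.com please"
mock_response = {
    "input": {"text": input_text},
    "entities": [
        {"entity": "contact_info", "value": "email", "location": [18, 33]}
    ],
}

def literal_of(response, entity_name):
    """Recover the matched text (the 'literal') for a pattern entity.

    Inside dialog you would write <? @contact_info.literal ?>; client side,
    the same text can be sliced out of the input via the entity location.
    """
    text = response["input"]["text"]
    for ent in response["entities"]:
        if ent["entity"] == entity_name:
            start, end = ent["location"]
            return text[start:end]
    return None

print(literal_of(mock_response, "contact_info"))   # bob@example.com
```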
<br />Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-4223414564865307402018-09-14T06:08:00.000-07:002018-09-17T03:43:28.491-07:00Connecting IBM Watson Speech services to the public phone systemMany use cases for <a href="https://www.ibm.com/watson/">IBM Watson</a> speech services involve connecting phone calls. This can be tricky, so I decided that it might be useful to publish a sample which shows such a connection in action. The application is very simple: it uses Speech to Text (STT) to understand what the caller says and, when the caller pauses, uses Text to Speech (TTS) to read it back to them. This simple application can easily be used as a starting point for building a more complex application which does more with the received speech.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-j5jlcPmqgr4QVZrnb928P2yBuZDczKdKHlcBDd71yDpp4i1QHtwwmEw_M_LNWtEtu3bH-a2IAIurtbBcCHipaLa30Lyeb7JV0xGSgbytonHJ0SvfH_Q9ZVLc8Cmo8fUy5y4sN1luUOw/s1600/voice-flow.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="222" data-original-width="182" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-j5jlcPmqgr4QVZrnb928P2yBuZDczKdKHlcBDd71yDpp4i1QHtwwmEw_M_LNWtEtu3bH-a2IAIurtbBcCHipaLa30Lyeb7JV0xGSgbytonHJ0SvfH_Q9ZVLc8Cmo8fUy5y4sN1luUOw/s1600/voice-flow.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Flow Diagram</td></tr>
</tbody></table>
<br />
<br class="Apple-interchange-newline" />
I chose to use the <a href="https://www.nexmo.com/">Nexmo</a> service because it is the easiest way to connect a phone call to a websocket. You can visit their <a href="https://developer.nexmo.com/">documentation site</a> if you want to learn the details of how this works. The short summary is that it acts as a broker between the phone system and a web application of your choice. You need to provide two URLs that define the interface. Firstly, Nexmo will do a GET on the '/answer' URL every time a call comes in to the number you configure; the way your application handles this request is the key part of the application experience. Secondly, Nexmo will do a POST to your '/events' URL any time anything happens on your number (e.g. a call comes in or the person hangs up); in our case we don't do anything interesting with these events except write them to the log for debugging purposes.<br />
<br class="Apple-interchange-newline" />
To get this working for yourself, the first thing you need to do is deploy my sample application. You can get the code from this <a href="https://github.com/bodonova/phone-stt/">GIT repository</a>. Before you deploy it to <a href="https://console.ng.bluemix.net/">IBM Bluemix</a>, you need to edit the <i>manifest.yml</i> file and choose a unique URL for your instance of the application. You also need to create instances of the IBM Watson STT and TTS services and bind them to your application.<br />
<br />
Next you need to configure Nexmo to connect to your application. You need to log in to the Nexmo website, click on the 'Voice' menu on the left side and then '<a href="https://dashboard.nexmo.com/voice/create-application">Create Application</a>'. This pops up a form where you can enter details of the /events and /answer URLs for the web application you just deployed. After you fill in this form, you will get a Nexmo application ID.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZozgA2ld7Qcov-nvZEMQc781-AqUdi6WLbygXZJpVtsaXPM4tt1T61M2MaS3ldQipEtJf0LnQoRYKSMmFFimR4bTkemoEX2PETUYFxEATJpZsZ8GWFmNekE8TQViyTo3Wrejnw0lNLUo/s1600/phone-app.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="378" data-original-width="635" height="238" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZozgA2ld7Qcov-nvZEMQc781-AqUdi6WLbygXZJpVtsaXPM4tt1T61M2MaS3ldQipEtJf0LnQoRYKSMmFFimR4bTkemoEX2PETUYFxEATJpZsZ8GWFmNekE8TQViyTo3Wrejnw0lNLUo/s400/phone-app.png" width="400" /></a></div>
<br />
Unfortunately, connecting to the phone system costs money. Nexmo charges different amounts for numbers depending upon which country they are associated with. In my case I bought the number +35315134721, which is a local number in Dublin, Ireland. This costs me €2.50 per month, so I might not leave it live too long, or I might swap it for a US-based number at a reduced cost of US$0.67 per month.<br />
<br />
Once you get out your credit card and buy a number, you must tell Nexmo which application ID you want to associate with the number. Visit the <a href="https://dashboard.nexmo.com/your-numbers">your numbers</a> page and enter the details (as you see below).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFuWd-sQfBhHyQdCvB3f1L2xFwTqq8jDKdx0e2Z2dhOvMDWPOHQFPlkWigKr5Ql2MXuyFyWDpINbq6X3mAw3-1IlBsupveV_hZsYaFPmCZO4B0KXBVpmhTfFO0lqEanInlMAhOkBzshfA/s1600/buy-number-pic.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="562" data-original-width="1242" height="144" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjFuWd-sQfBhHyQdCvB3f1L2xFwTqq8jDKdx0e2Z2dhOvMDWPOHQFPlkWigKr5Ql2MXuyFyWDpINbq6X3mAw3-1IlBsupveV_hZsYaFPmCZO4B0KXBVpmhTfFO0lqEanInlMAhOkBzshfA/s320/buy-number-pic.png" width="320" /></a></div>
<br />
<br />
Having done this, you can now ring the number and see it in action. When we receive a call, we open a websocket interface to the STT service and start echoing all audio received from the phone line to the STT service. Although the TTS service supports a websocket interface, we don't use it: the chunks of audio data from the TTS service won't necessarily be returned evenly spaced, and streaming them straight to the Nexmo service would produce crackly audio output. Instead we use the REST interface, writing the returned audio into a temporary file before streaming it back into the phone call at a smooth rate.<br />
<br />
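The smooth-rate playback boils down to slicing the synthesised audio into small fixed-duration frames and writing one frame per timer tick rather than dumping the whole buffer onto the socket. The frame arithmetic can be sketched like this (the 16 kHz, 16-bit mono format and 20 ms frame size are assumptions; check what your Nexmo websocket is actually configured for):

```python
def frames(audio: bytes, sample_rate=16000, sample_width=2, frame_ms=20):
    """Split raw PCM audio into fixed-duration frames.

    Writing one frame every `frame_ms` milliseconds plays the audio back
    at real-time speed instead of sending it in one uneven burst.
    """
    frame_bytes = sample_rate * sample_width * frame_ms // 1000
    return [audio[i:i + frame_bytes] for i in range(0, len(audio), frame_bytes)]

one_second = bytes(16000 * 2)   # 1 s of silence at 16 kHz, 16-bit mono
chunks = frames(one_second)
print(len(chunks))              # 50 frames of 20 ms each
```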
The bulk of the code is in a file named <i>index.js</i> and it is fairly well explained in the comments, but here are a few more explanatory notes:<br />
<br />
<ul>
<li>The first 70 lines or so are boilerplate code which should be familiar to anyone who has experience of deploying node.js application to BlueMix. First we import the required libraries that we use and then we try and figure out the details of the Watson service instances that we are using. If running on the cloud, this will be parsed from the environment variables. However, if you want to run it locally, you will need a file named <i>vcap-local.json</i> that contains the same information. I have included a file named <i>vcap-local-sample.json</i> in the repository to show you the required structure of the file.</li>
<li>Next comes a function named <i>tts_stream</i> which acts as an interface to the TTS service. It takes two parameters: the <i>text</i> to synthesise and the <i>socket</i> on which to play the result. We use the REST interface instead of opening a websocket to the TTS service (like we do with the STT service), because streaming directly from the TTS websocket results in crackly audio, as the audio chunks coming back from the service are not evenly spaced. The function saves the audio to a temporary file and then pipes the file smoothly into the Nexmo socket before deleting the temporary file. This approach introduces a slight delay, because we need to wait for the entire response to be synthesised before we start playing. However, the problem is not as bad as you might think, because a 15 second audio response might get sent back in under a second.</li>
<li>Next come the two functions which respond to the /events and /answer URLs. As mentioned earlier, the /events handler is very simple because it just echoes the POST data into the log. The /answer function is surprisingly simple also. First it creates a websocket, and then it sends a specially formatted message back to Nexmo to tell it that you want to connect the incoming phone call into the new websocket.</li>
<li>The real meat of the code is in the on connect method which we associate with the websocket that we created.</li>
<ul>
<li>The first thing we do is stream audio from a file named <i>greeting.wav</i> which explains to the user what to do. While this message is helpful to the user, it also gives the application some breathing room because it might take some time to initialise the various services and the greeting will stop the user talking before we are ready.</li>
<li>Next we create a websocket <i>stt_ws</i> which is connected to the Watson STT service. </li>
<ul>
<li>As soon as the connection is established, we send a special JSON message to the service to let it know what type of audio we will send and what features we want to enable.</li>
<li>When the connection to STT is started, a special first message is sent back saying that it is ready to receive audio. We use a boolean variable <i>stt_connected</i> to record whether or not this message is received. This is because attempting to send audio data to the websocket before it is ready will cause errors.</li>
<li>When starting the STT service, we specify that we would like to receive <i>interim results</i>, i.e. results sent while it is transcribing some audio and thinks it knows what was said, but does not yet consider the transcription final (because it might change its mind when it hears what comes next). We do this because we want to speed up responses, but we don't want to echo back a transcription which might later be revised. For this reason we check the value of the <i>final</i> variable in the returned JSON and only call the <i>tts_stream</i> function when the results are final.</li>
</ul>
<li>For the Nexmo websocket, we simply say that all input received should automatically be echoed to the STT websocket (once we have received confirmation that the STT service link has been initialised properly).</li>
<li>When the Nexmo websocket closes, we also try to close the STT websocket.</li>
</ul>
</ul>
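To make the start message and the interim/final handling above concrete, here is a small sketch (in Python rather than the sample's node.js, with an invented transcript); the message shapes follow the STT websocket interface described in the notes, with <i>interim_results</i> enabled and only final results kept:

```python
import json

# The "start" message sent as soon as the STT websocket opens.
start_msg = json.dumps({
    "action": "start",
    "content-type": "audio/l16;rate=16000",
    "interim_results": True,
})

def final_transcripts(message_json):
    """Return only the transcripts whose results are marked final."""
    msg = json.loads(message_json)
    return [r["alternatives"][0]["transcript"]
            for r in msg.get("results", []) if r.get("final")]

# An invented interim-then-final pair of results for illustration.
sample = json.dumps({"results": [
    {"alternatives": [{"transcript": "hello wor"}], "final": False},
    {"alternatives": [{"transcript": "hello world "}], "final": True},
]})
print(final_transcripts(sample))   # ['hello world ']
```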
To give credit, I should point out that my starting point was <a href="https://github.com/nexmo-community/ws-phone">this sample from nexmo</a>. I should also point out that the code is currently only able to deal with one call at a time. It should be possible to solve this problem, but I will leave this as a learning exercise for some reader of the blog.Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-46081564524192642672018-09-07T09:30:00.001-07:002019-04-12T04:26:17.028-07:00Watson Speech to Text supports language and acoustic customizations, but do you need both?<div class="separator" style="clear: both; text-align: right;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYZny67jG7jlyertm_d3fCx4rmK30WKrzFRbDXq7e6N06x8-AAE2M3GXflDv-B0m4bpcLwA-fcQLXKCRjLh69XfgMcF-R7Mifetxhv3LxGuTs-tv9xZkzDb1bRQaXSmqxZhDDR0rTaevQ/s1600/STT-logo-small.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="200" data-original-width="852" height="75" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYZny67jG7jlyertm_d3fCx4rmK30WKrzFRbDXq7e6N06x8-AAE2M3GXflDv-B0m4bpcLwA-fcQLXKCRjLh69XfgMcF-R7Mifetxhv3LxGuTs-tv9xZkzDb1bRQaXSmqxZhDDR0rTaevQ/s320/STT-logo-small.png" width="320" /></a></div>
The <a href="https://www.ibm.com/watson/services/speech-to-text/">Watson Speech to Text service</a> has recently released a feature whereby customers can customize the service so that it works better for their specific domain. These customizations can be in the form of a <i>Customized Language Model</i> and/or a <i>Customized Acoustic Model</i>. Customers sometimes get confused by these two different model types, wondering which they need or whether they need both.<br />
<br />
The quick summary is that the customized language model tells Watson how the words spoken in your domain differ from <i>normal</i> English (or whatever other base language you are using); for example, you might be transcribing speech where the speakers use specialised terminology. The customized acoustic model, on the other hand, tells Watson that the words spoken in your domain might sound quite different from how they were spoken in the corpus initially used to train Watson STT; for example, you might have audio samples where the speakers have a strong regional accent.<br />
<br />
Depending upon your domain, you may need both types of customization, but let's look at each of them in more detail first.<br />
<br />
<h4>
Customized Language Models</h4>
Customized language models tell Watson what words are likely to occur in your domain. For example, when you specify that you want to use the en-US language, Watson will have a very large (but fixed) list of possible words that can be spoken in US English. However, in your domain, the users might use a specialised vocabulary. The purpose of the customized language model is to teach Watson how the language in your domain is different from <i>normal</i> English.<br />
<br />
The way you build a customized language model is that you provide one or more corpora which are simple text files containing a single utterance per line. It is important to give complete utterances, because accurate speech to text transcription requires that the service knows not only what words might be seen but also in what context the words are likely to occur.<br />
<br />
If you are building a model to be used in transcribing film reviews, your corpus might include words like <i>movie</i>, <i> release</i>, <i> star</i> and <i>blockbuster</i>. These words are already in the Watson standard dictionary, but including them in your model tells Watson that these words are more likely to occur in your domain than normal (which increases the chance that they will be recognised).<br />
<br />
<ul>
<li>You might also include the word <i>umpa lumpas</i> in your corpus, since people will be discussing them and Watson needs to know it is a valid word. Because this word is pronounced the way it is written, simply listing it in the corpus is enough. </li>
<li>However, if you are interested in Irish movies, it is likely that people will speak about an actress named <i>Ailbhe</i> or <i>Caoimhe</i>. These common Irish forenames wouldn't be in the Watson dictionary, but it is not enough to tell Watson that they exist. You also need to tell Watson that <i>Ailbhe</i> is pronounced like <i>Alva</i> and <i>Caoimhe</i> is pronounced like <i>Keeva</i>.</li>
</ul>
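To make the custom-words idea concrete, here is a sketch of the payload shape used when adding words with pronunciations to a language model. The <i>word</i>/<i>sounds_like</i> field names follow the Watson STT customization API, but verify them against the current documentation; <i>buildCustomWords</i> is just a hypothetical helper for building the request body.

```javascript
// Sketch of the payload for adding custom words with pronunciations to a
// language model. The word/sounds_like field shape follows the Watson STT
// customization API, but verify it against the current documentation.
function buildCustomWords(entries) {
  return {
    words: entries.map(([word, soundsLike]) =>
      soundsLike
        ? { word: word, sounds_like: soundsLike } // non-obvious pronunciation
        : { word: word }                          // pronounced as written
    )
  };
}

const payload = buildCustomWords([
  ['umpalumpa', null],       // hypothetical word, pronounced as written
  ['Ailbhe', ['Alva']],      // Irish forename pronounced like "Alva"
  ['Caoimhe', ['Keeva']]     // Irish forename pronounced like "Keeva"
]);
console.log(JSON.stringify(payload, null, 2));
```

A payload like this would then be POSTed to the customization's <i>words</i> resource before retraining the model.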
<br />
Building a customized language model is a relatively simple job, yet it will probably bring about a dramatic reduction in word error rate. If your audio contains examples of people saying words that are not in the standard Watson dictionary, you will never transcribe these properly without a customized language model. In addition, when your speakers say words that are in the Watson dictionary, the language customization model will increase the chances of their being properly transcribed.<br />
<br />
Many users find that language customization by itself will meet their needs, so there is not necessarily any need to combine it with an acoustic model.<br />
<br />
<h4>
Customized Acoustic Models</h4>
Customized acoustic models allow you to tell Watson what words sound like in your domain. For example, if speakers in your region consistently pronounce the word <i>there</i> as if they are saying <i>dare</i>, you might need to build a customized acoustic model to account for this.<br />
<br />
At one level, building a customized acoustic model is even easier than building a customized language model. All you need to do is upload between 10 minutes and 50 hours of sample audio that is typical of the speech you will be trying to transcribe with the model you are building, and then train the model.<br />
<br />
However, if you read the documentation carefully, you will see that it says "you will get especially good results if you train with a language model built from a transcription of the audio". Transcribing 50 hours of speech is a lot of work, and so many people ignore this advice. However, I think the advice should read "you are extremely unlikely to get good results unless you train with a language model built from a transcription of the audio". In my experience, training without a language model containing the transcription can very often produce a model whose word error rate (WER) is significantly worse than having no model at all.<br />
<br />
To understand why this is the case, you need to look a little closer at how acoustic model training works. For illustration purposes, assume that the problem you are trying to solve is that the utterance a-b-c is sometimes being erroneously transcribed as x-y-z.<br />
<br />
<ul>
<li>If you train with a language model, Watson will encounter an ambiguous utterance in your training data which it thinks is 70% likely to be x-y-z and 55% likely to be a-b-c. Since your language model doesn't contain x-y-z it will know that it must be a-b-c and it will make adjustments to the neural network to make it more likely that this utterance will be transcribed as a-b-c in the future. Hence, the model gets <b><u>better</u></b>.</li>
<li>On the other hand, if you train without a language model, Watson will encounter an ambiguous utterance in your training data which it thinks is 70% likely to be x-y-z and 55% likely to be a-b-c. Since it has no other information, it will assume that it must be x-y-z. However, the confidence score is not very high, so it will make adjustments to the neural network to make it even more likely that this utterance will be transcribed as x-y-z in the future. Hence, the model gets <b><u>worse</u></b>.</li>
</ul>
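The two bullet points above can be sketched as a toy decision rule (purely illustrative; real decoders are far more sophisticated than this):

```javascript
// Toy illustration of why a language model matters during acoustic training.
// candidates: [{text, score}] proposed by the acoustic model; vocab: a Set of
// utterances the language model considers possible (null = no language model).
function pickTranscription(candidates, vocab) {
  const allowed = vocab
    ? candidates.filter(c => vocab.has(c.text))
    : candidates;
  // Fall back to all candidates if the language model rules everything out
  const pool = allowed.length > 0 ? allowed : candidates;
  return pool.reduce((best, c) => (c.score > best.score ? c : best));
}

const candidates = [
  { text: 'x-y-z', score: 0.70 },
  { text: 'a-b-c', score: 0.55 }
];

// Without a language model the higher-scoring wrong guess wins...
console.log(pickTranscription(candidates, null).text);               // x-y-z
// ...but a language model that only contains a-b-c corrects it.
console.log(pickTranscription(candidates, new Set(['a-b-c'])).text); // a-b-c
```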
<br />
Of course the chance of such an error happening is related to the word error rate. In my experience, users rarely put the effort into building a customized model when the WER is low. Mostly people build customized models when they are seeing a very high WER, and hence they often see carelessly built acoustic models making the problem even worse.<br />
<br />
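For reference, word error rate is the word-level edit distance (substitutions, insertions and deletions) between the reference transcript and the hypothesis, divided by the number of reference words. A minimal sketch:

```javascript
// Word error rate: (substitutions + insertions + deletions) / reference length,
// computed with a standard Levenshtein edit distance over words.
function wordErrorRate(reference, hypothesis) {
  const ref = reference.trim().split(/\s+/);
  const hyp = hypothesis.trim().split(/\s+/);
  // d[i][j] = edit distance between the first i ref words and first j hyp words
  const d = Array.from({ length: ref.length + 1 }, (_, i) =>
    Array.from({ length: hyp.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= ref.length; i++) {
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,        // deletion
        d[i][j - 1] + 1,        // insertion
        d[i - 1][j - 1] + cost  // substitution or match
      );
    }
  }
  return d[ref.length][hyp.length] / ref.length;
}

// One substitution ("there" -> "dare") out of four reference words
console.log(wordErrorRate('there is a problem', 'dare is a problem')); // 0.25
```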
Another problem people encounter is building an acoustic model from speech which is not typical of their domain. For example, users might be tempted to get a single actor to read out their entire script and record the samples in a recording studio with a good microphone. When their application goes live, they might have to deal with audio recorded on poor phone lines, in a noisy environment, by people with different regional accents.<br />
<br />
<h3>
Summary</h3>
Language and acoustic customizations serve different purposes: the first deals with non-standard vocabulary, while the second deals with non-standard speech sounds. You may find that you can build a customized language model very easily and that this is enough for your domain. An acoustic model can improve your WER even further, but you should be careful to ensure you build a good one. In particular, you should use transcribed data rather than just collecting random samples.Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-39969592579352235452018-07-23T07:46:00.000-07:002018-07-23T07:46:05.219-07:00Watson Asynchronous Speech to Text<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhY4baepygxjLUpKQtlAmUjJIG7T-oYDYFo_EF03Om8dwMbbJmv2r2hrO3Z4Wk4f8UBTFUyjpJfERsrtgVIxU7rOp-7d4f_3pQto3VdfzS8B2jv-G9f-ZKMsWipqQJZTHUkr4gklGmShaw/s1600/TexttoSpeech.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="200" data-original-width="300" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhY4baepygxjLUpKQtlAmUjJIG7T-oYDYFo_EF03Om8dwMbbJmv2r2hrO3Z4Wk4f8UBTFUyjpJfERsrtgVIxU7rOp-7d4f_3pQto3VdfzS8B2jv-G9f-ZKMsWipqQJZTHUkr4gklGmShaw/s200/TexttoSpeech.png" width="200" /></a></div>
Speech to Text (STT) transcription can take a long time. For this reason the Watson Speech to Text service offers an <a href="https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/curl.html?curl#asynchronous-api" target="_blank">Asynchronous API</a> where the caller doesn't need to wait around while transcription is happening. Instead the person requesting the transcription provides details of a <i>callback server</i> which should be notified when the transcription is complete.<br />
<br />
This programming style is not too difficult when you get used to it, but there are <a href="https://console.bluemix.net/docs/services/speech-to-text/async.html#async" target="_blank">a number of concepts to learn</a>, and one thing that slows people down is that you need a fully functional callback server before you can see any of the other components in action.<br />
<br />
In order to help people get started, I decided to write a very basic callback server which interacts with the Watson STT service. It can help you understand the interaction between the various components and can also serve as a starting point for a fully functional callback server.<br />
<br />
My callback server is implemented in node.js and, because it is small, all of the code is in a single file called <i>app.js</i>. Like all node.js programs, it starts with a list of the dependencies which we will use:<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
span.s1 {color: #931a68}
span.s2 {color: #3933ff}
</style>
<br />
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">const</span> express = require(<span class="s2">'express'</span>);</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">const</span> crypto = require(<span class="s2">'crypto'</span>);</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">const</span> bp = require(<span class="s2">"body-parser"</span>);</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">var</span> jsonParser = bp.json()</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">const</span> app = express();</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">const</span> port = process.env.PORT || 3000;</span></div>
</blockquote>
Next we define a variable called <i>secret_key</i>. For security reasons, you probably don't want any random person to be able to send notifications to your callback server. Therefore the Watson STT asynchronous API allows you to specify a secret key that is used to sign all requests from the Watson STT service to your callback server. A secret which is published in a blog post is not really a secret, so you should edit this variable to some secret value which is unique to your deployment. If you don't want to use this security feature, just set the variable to null.<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #3933ff}
span.s1 {text-decoration: underline}
span.s2 {color: #931a68}
span.s3 {color: #000000}
</style>
<br />
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">// <span class="s1">var</span> secret_key = null</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2">var</span><span class="s3"> secret_key = </span>'my_secret_key'<span class="s3">;</span></span></div>
</blockquote>
This callback server doesn't do much other than write messages in the log to help you understand the flow of messages to and from your callback server. The <i>log_request()</i> function performs that key task for each request.<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; min-height: 15.0px}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
span.s1 {color: #931a68}
span.s2 {color: #3933ff}
span.s3 {text-decoration: underline}
</style>
<br />
<div class="p1">
<br /></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">// record details of the request in the log (for debugging purposes)</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">function</span> log_request (request) {</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s2">'verb='</span>+request.method);</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s2">'url='</span>+ request.originalUrl);</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s2">"Query: "</span>+JSON.stringify(request.query));</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s2">"Body: "</span>+JSON.stringify(request.body));</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s2">"Headers: "</span>+JSON.stringify(request.headers));</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">}</span></div>
</blockquote>
The only thing about this server which is moderately complex is the way it handles signatures. The following function checks whether or not the request contains a valid signature. If the <i>secret_key</i> variable is set to null then no checking is done. When the signature is not valid, it puts a message in the log and responds telling you what the signature should have contained. This behaviour is intended to be helpful for developers debugging interactions, but you would probably want to turn it off for production systems because it would also be helpful to hackers.<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; min-height: 15.0px}
p.p4 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #3933ff}
span.s1 {color: #931a68}
span.s2 {color: #000000}
span.s3 {color: #3933ff}
span.s4 {text-decoration: underline}
</style>
<br />
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace;">/<span style="font-size: small;">/ check if the signature is valid</span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s1">function</span> check_signature(request, in_text) {</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2"> </span>// check the request has a signature if we are configured to expect one</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">if</span> (secret_key) {</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">var</span> this_signature = request.get(<span class="s3">'x-callback-signature'</span>);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">if</span> (!this_signature) {</span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2"><span class="Apple-converted-space"> </span>console.log(</span>"No signature provided despite the fact that this server expects one"<span class="s2">);</span></span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2"> </span><span class="s1">throw</span><span class="s2"> </span><span class="s1">new</span><span class="s2"> Error(</span>"No signature provided despite the fact that this server expects one"<span class="s2">);</span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>} <span class="s1">else</span> {</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s3">"Signature: "</span>+this_signature);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2"> </span>// Calculate what we think the signature should be to make sure it matches</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">var</span> hmac = crypto.createHmac(<span class="s3">'sha1'</span>, secret_key);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>hmac.update(in_text);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>hmac.end();</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">var</span> hout = hmac.read();</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">var</span> expected_signature = hout.toString(<span class="s3">'base64'</span>);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s3">"Expected signature: "</span>+expected_signature);</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">if</span> (this_signature !=<span class="s4"> </span>expected_signature) {</span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s2"><span class="Apple-converted-space"> </span>err_str = </span>"Actual signature \""<span class="s2">+this_signature+</span>"\" does not match what we expected \""<span class="s2">+expected_signature+</span>"\""<span class="s2">;</span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(err_str);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">throw</span> <span class="s1">new</span> Error(err_str);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}<span class="Apple-converted-space"> </span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">}</span></div>
</blockquote>
The server needs to handle POST requests coming from the Watson STT service when the status of any transcription job changes. All we do is log the request for debugging purposes. If the signature matches the body of the POST, we respond with a status of 200 and a body of OK. Obviously a production server would be expected to do something more useful.<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; min-height: 15.0px}
p.p4 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #3933ff}
span.s1 {color: #931a68}
span.s2 {color: #000000}
span.s3 {color: #3933ff}
span.s4 {text-decoration: underline}
</style>
<br />
<div class="p1">
/<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">/ Handle POST requests with STT job status notification</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">app.post(<span class="s1">'/results'</span>, jsonParser, (request, response) => {</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>log_request (request);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s2">if</span> (!request.body) {</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span></span><span class="s2">var</span><span class="s3"> err_text = </span>'Invalid POST request with no body'<span class="s3">;</span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(err_text);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>response.status(400);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>return response.send(err_text);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>check_signature(request, JSON.stringify(request.body));</span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span></span>// for now just record the event in the log</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>console.log(<span class="s1">'Event id:'</span>+request.body.id+<span class="s1">' event:'</span>+request.body.event+<span class="s1">' user_token:'</span>+request.body.user_token);</span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span></span>// The <span class="s4">spec</span> is not clear about what we should respond, so we just say OK</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>response.type(<span class="s1">'text/plain'</span>);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>response.send(<span class="s1">"OK"</span>);</span></div>
<div class="p1">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #3933ff}
p.p4 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; min-height: 15.0px}
span.s1 {color: #3933ff}
span.s2 {color: #931a68}
span.s3 {color: #000000}
span.s4 {text-decoration: underline}
</style>
</div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">})</span></div>
</blockquote>
When registering your callback server, Watson STT issues a GET request with a random <i>challenge_string</i> to see if your server is up and running. If the signature on the request matches the content of the challenge_string, we simply echo back the challenge_string to let the Watson server know we are functioning OK. If the signature is wrong, we issue an error response and the registration of the callback server will fail.<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #4e9072}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
p.p3 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; min-height: 15.0px}
p.p4 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #3933ff}
span.s1 {color: #3933ff}
span.s2 {color: #931a68}
span.s3 {color: #000000}
</style>
<br />
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">// Deal with the initial request checking if this is a valid STT callback URL</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">app.get(<span class="s1">'/results'</span>, (request, response) => {</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>log_request (request);</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s2">if</span> (!request.query.challenge_string) {</span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span>console.log(</span>"No challenge_string specified in GET request"<span class="s3">);</span></span></div>
<div class="p4">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span></span><span class="s2">throw</span><span class="s3"> </span><span class="s2">new</span><span class="s3"> Error(</span>"No challenge_string specified in GET request"<span class="s3">);</span></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}</span></div>
<div class="p3">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><br /></span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>check_signature(request, request.query.challenge_string);</span></div>
<div class="p3">
<br /></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>response.type(<span class="s1">'text/plain'</span>);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>response.send(request.query.challenge_string);</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">})</span></div>
</blockquote>
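For completeness, you register the callback server by POSTing to the service's register_callback endpoint, passing the callback URL and secret as query parameters. The path and parameter names below follow the asynchronous API documentation, but verify them against the current docs; <i>registerCallbackUrl</i> is a hypothetical helper that just builds the request URL.

```javascript
// Sketch of building the callback-registration request URL.
// The /v1/register_callback path and the callback_url / user_secret query
// parameters follow the Watson STT async API docs; verify before relying on this.
function registerCallbackUrl(serviceBase, callbackUrl, userSecret) {
  const params = new URLSearchParams({ callback_url: callbackUrl });
  if (userSecret) params.set('user_secret', userSecret); // omit if unsigned
  return serviceBase + '/v1/register_callback?' + params.toString();
}

const url = registerCallbackUrl(
  'https://stream.watsonplatform.net/speech-to-text/api', // example endpoint
  'https://stt-async.eu-gb.mybluemix.net/results',
  'my_secret_key'
);
console.log(url);
```

The <i>user_secret</i> you pass here must match the <i>secret_key</i> configured in the callback server, otherwise the signature check on the challenge request will fail.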
Finally the app starts listening for incoming requests:<br />
<blockquote class="tr_bq">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Monaco; color: #295e99}
span.s1 {color: #931a68}
span.s2 {color: #3933ff}
span.s3 {color: #000000}
</style>
<br />
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">app.listen(port, (err) => {</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">if</span> (err) {</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span><span class="s1">return</span> console.log(<span class="s2">'something bad happened'</span>, err);</span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="Apple-converted-space"> </span>}</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;"><span class="s3"><span class="Apple-converted-space"> </span>console.log(</span>`server is listening on ${port}`<span class="s3">);</span></span></div>
<div class="p1">
<span style="font-family: "courier new" , "courier" , monospace; font-size: small;">})</span></div>
</blockquote>
I have an instance of this callback processor running at <a href="https://stt-async.eu-gb.mybluemix.net/results">https://stt-async.eu-gb.mybluemix.net/results</a> but it is not really any use to you since you won't be able to see the console log messages. You can also download the complete sample code from <a href="https://github.com/bodonova/stt-async" target="_blank">GitHub</a> and host it either in <a href="https://bluemix.net">BlueMix</a> or the hosting platform of your choice.<br />
Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com6tag:blogger.com,1999:blog-3777125262143150721.post-68803717719647907332018-04-05T14:59:00.001-07:002018-04-05T14:59:38.798-07:00Naming IntentsHow should you name intents? Here's one way and an explanation as to why.
<p>
In <a href="https://watson-tricks.blogspot.ie/2018/04/clustering-questions-part-2-intentions.html">a previous post</a> we described clustering a topic into intents. The naming scheme I used was <b>TopicIntent</b>.
<p>
When you go to improve accuracy you will merge and split intents, but you tend not to do this across Topics. I find that if the topic name is part of the intent name, it is easier to keep your brain in one context while making these changes.
<p>
<b>Cluster Topics</b>
<p>
"Happy families are all alike; every unhappy family is unhappy in its own way." the <a href="https://en.wikipedia.org/wiki/Anna_Karenina_principle">Anna Karenina principle</a>
<p>
Some Topics cover a lot of ground, but you don't really care about the individual intents inside them. For example, a <b>Complaints</b> topic could cover all sorts of things people moan about.
<p>
No one wants a message back saying "This robot cares that we have lost your bags". A complaint question will have to be passed on to a person. If we can tell that person that we have a complaint, they can then decide what to do next. If you do not break the complaints topic down into intents, though, all sorts of questions will end up in one intent. It will deal with damage, delays, queues, lost items, dirty conditions etc. This giant, varied intent will suck in other questions, damaging your overall system accuracy.
<p>
A varied topic like complaints is one your chatbot cannot handle by itself. If you make one giant intent, it will damage your overall accuracy. But because complaints tend to be about a few things at once, <i>'The food was terrible and the portions were small'</i>, there is often no single solid intent anyway. By labelling all complaints <b>ComplaintIntent</b> it is possible to ignore the intent part, because getting the topic right is good enough.
<p>
In our accuracy tests we can strip the intent part off and say that landing anywhere in <b>Complaint</b> is good enough, without creating one giant intent that covers too much and sucks in all other questions.
<p>
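The topic-level scoring described above can be sketched in Python. This is a minimal illustration, assuming a known list of topic prefixes; the intent names and test results are invented for the example:

```python
# Hypothetical topic prefixes; intent names follow the TopicIntent scheme.
TOPICS = ["Complaint", "Booking", "Checkin"]

def topic_of(intent):
    """Strip the intent suffix: return the topic prefix of a TopicIntent name."""
    for topic in TOPICS:
        if intent.startswith(topic):
            return topic
    return "Unknown"

def topic_accuracy(results):
    """results: list of (expected_intent, returned_intent) pairs."""
    hits = sum(topic_of(expected) == topic_of(got) for expected, got in results)
    return hits / len(results)

results = [("ComplaintDamage", "ComplaintDelay"),  # wrong intent, right topic
           ("BookingCancel", "BookingCancel"),
           ("CheckinTime", "BookingPay")]
print(topic_accuracy(results))  # 2 of the 3 land in the right topic
```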
This issue of big topics particularly arises with Off Topic topics, where questions are out of scope, silly, or cover large areas that you can't really answer.
<p>
There are other ways to label intents. This <b>TopicIntent</b> method is what I use. If you have a different way please mention it in the comments.Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-36995561792275451282018-04-04T12:32:00.001-07:002018-04-05T15:00:27.820-07:00Clustering Questions Part 2: IntentionsOnce you have <a href="https://watson-tricks.blogspot.ie/2018/04/clustering-questions-into-topics.html">divided your questions into <i>Topics</i></a> the next step is to divide them into <i>Intents</i>. This is how I would find the intents inside a topic.
<p>
An <b>Intent</b> is a purpose or goal expressed by a user’s input such as finding contact information or booking a trip.
<p>
Imagine you had an airline booking chatbot, and you had these questions in the Topic <b>booking</b>.
<p>
There is a dataset of <a href="https://datasets.maluuba.com/Frames/dl">travel questions here</a>.
I will take some questions from there and invent some myself.
<p>
A booking topic could have:
<table style="width:100%">
<tr>
<th>Question</th>
<th>Intent</th>
</tr>
<tr>
<td>I'd like to book a trip to Atlantis from Caprica on May 13</td>
<td>BookTicket</td> </tr>
<tr>
<td>I'd like to book a trip from Chicago to San Diego between Aug 26th and Sept 5th</td>
<td>BookTicket</td> </tr>
<tr>
<td>i wanna go to Kobe whats available?</td>
<td>BookTicket</td> </tr>
<tr>
<td>Can I get information for a trip from Toluca to Paris on August 25th?</td>
<td>BookTicket</td> </tr>
<tr>
<td>I'd like to book a trip to Tel Aviv from Tijuana. I was wondering if there are any packages from August 23rd to 26th</td>
<td>BookTicket</td> </tr>
<tr>
<td>I want to know how far in advance I can book a flight</td>
<td>BookFuture</td> </tr>
<tr>
<td>When do bookings open for 6 months time</td>
<td>BookFuture</td> </tr>
<tr>
<td>I want to get a ticket for my christmas flight home</td>
<td>BookFuture</td> </tr>
<tr>
<td>Can I check my booking?</td>
<td>BookCheck</td> </tr>
<tr>
<td>can i check my booking status</td>
<td>BookCheck</td> </tr>
<tr>
<td>can i check the status of my booking</td>
<td>BookCheck</td> </tr>
<tr>
<td>how do i check the status of my booking</td>
<td>BookCheck</td> </tr>
<tr>
<td>i need to check my booking status</td>
<td>BookCheck</td> </tr>
<tr>
<td>Let me know the status of my Booking</td>
<td>BookCheck</td> </tr>
<tr>
<td>Can I book now pay later</td>
<td>BookPay</td> </tr>
<tr>
<td>How can I pay for a booking?</td>
<td>BookPay</td> </tr>
</table>
<p>
In this table the verbs <b>Check</b> and <b>Pay</b> each seem to form an intent.
There is one intent about <b>when</b> bookings can happen.
And there are a few unknowns that might make more sense when we have more questions later.
<p>
At this stage, accept that you are going to make mistakes and will have to go back over your intents as you learn by doing. I will come back to fixing your intents once you have had a first cut at defining them.
<p>
One good way to find intents in a topic is to look for verbs. Unrelated actions tend to have different verbs. In this case the word <b>Booking</b> is both a verb and a noun. This is common enough. Dual meanings like this can be a nightmare with Entities, but that is another blogpost.
<p>
In this topic something like 'cancel a booking' is likely to be an intention. Here 'cancel' is the verb and 'booking' is the object of the sentence.
<p>
Other clues to the intention are the <a href="http://brenocon.com/watson_special_issue/02%20question%20analysis.pdf"><b>Lexical Answer Type</b></a> (LAT), the subject and the object.
The LAT is the type of question: Who questions have different types of answers to When questions. In practice I find you rarely use the LAT to define intentions.
<p>
One possible exception to this is definitional questions, where users ask "What is a..." to have a domain term explained. If more than 5% of your questions are definitional, you may not have collected representative questions: manufactured questions from non-real users, or from real users forced to ask questions, tend to be definitional. When someone runs out of real questions they will ask '<i>What is a booking</i>'.
<p>
The <b>Subject</b> of the sentence is also rarely useful. Sometimes who is doing an action changes the answer, but usually there is a set scheme to buy, book, cancel and so on, and who is doing it doesn't matter.
<p>
The <b>Object</b> of the sentence is more often useful. Frequently an intention is a combination of the verb and what it is being done to. Whichever one isn't the Topic is usually the intent. Booking might be a topic and various things you do with a booking would be intents.
<p>
In summary, go through each topic. If verbs are shared across questions, those questions might belong together in an intent. But you have to use the domain expert's knowledge of which questions share an intention; this step cannot be automated.
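As a rough illustration of that verb-based first pass, here is a sketch in Python. The verb list and questions are invented for the example, and a domain expert still needs to review every bucket:

```python
# Candidate action verbs for the Booking topic (illustrative, not exhaustive).
VERBS = ["cancel", "check", "pay", "change"]

def bucket_by_verb(questions):
    """Group questions by the first known verb they contain; the rest go to 'unknown'."""
    buckets = {verb: [] for verb in VERBS}
    buckets["unknown"] = []
    for question in questions:
        words = question.lower().split()
        verb = next((v for v in VERBS if v in words), "unknown")
        buckets[verb].append(question)
    return buckets

questions = ["Can I cancel my booking?",
             "How do I pay for a booking?",
             "i need to check my booking status",
             "whats available to Kobe?"]
buckets = bucket_by_verb(questions)
```

Each non-empty bucket is a candidate intent; the 'unknown' bucket is the pile the expert has to read by hand.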
Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-50423002121848573702018-04-03T12:03:00.003-07:002018-04-03T12:03:46.176-07:00Clustering Questions into TopicsIBM Watson used to claim it took 15 minutes to match up a question with an intent. The technique described here halves that time. Context switching is mentally draining and wastes a lot of time. Concentrating on one part of a job until it is done is much more efficient than switching between tasks.
<p>
In a similar way once we have our questions collected the next task is to divide them into topics. Then these topics will be looked at individually.
<p>
A topic is a category of types of questions people will ask your chatbot.
<p>
In an <b>Airline</b> these might be <i>Checkin, Booking, Airmiles</i>
<p>
In an <b>Insurance company</b> <i>Renewal, Claim, Coverage</i>
<p>
Before looking at the questions, try to think of 5 topics that might occur in customer questions to your business.
<p>
<b>How many topics?</b>
<p>
Roughly 20. You might have ten or 30. A rule of thumb used in <a href="http://www.cs.haifa.ac.il/~rita/ml_course/lectures/KNN.pdf">K Nearest Neighbour</a> classification is that if you have N documents you expect to have Sqrt(N)
clusters. This works out as about 44 clusters for 2,000 questions, but you won't have 2,000 questions at this stage; more likely under 1,000.
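The arithmetic behind this rule of thumb, for a few collection sizes:

```python
import math

# Rule of thumb: with N questions, expect roughly sqrt(N) topics.
for n in (500, 1000, 2000):
    print(n, "questions ->", int(math.sqrt(n)), "topics")
# 500 -> ~22, 1000 -> ~31, 2000 -> ~44
```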
<p>
<b>Can you automate discovering topics?</b>
<p>
Yes, you can, using a clustering algorithm (such as k-means) with the number of clusters given above. But you really should not. You learn a hell of a lot clustering 500 questions, and you will have to read all these questions eventually anyway,
so you might as well learn this stuff now.
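If you do want a machine-made first draft to compare against your manual pass, the clustering step can be sketched with scikit-learn (an assumed choice of library; the post does not prescribe one): TF-IDF vectors plus k-means with k near sqrt(N).

```python
import math
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

questions = [
    "Can I check my booking?",
    "how do i check the status of my booking",
    "I'd like to book a trip to Atlantis",
    "i wanna go to Kobe whats available?",
    "Can I book now pay later",
    "How can I pay for a booking?",
]

# Rule-of-thumb cluster count: roughly sqrt(N), but at least 2.
k = max(2, int(math.sqrt(len(questions))))
vectors = TfidfVectorizer(stop_words="english").fit_transform(questions)
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(vectors)

# Print questions grouped by cluster label for manual review.
for label, question in sorted(zip(labels, questions)):
    print(label, question)
```

Treat the output only as a starting point for the manual review described above, not as the final topic list.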
<p>
<b>Process of Marking up topics</b>
<p>
Say you have 500 questions in a spreadsheet. What we are trying to do here is mark up a new column, 'Topic', that puts each of these questions in a topic.
<p>
Go through your 500 questions looking for the 5 topics you listed above. You may find that what you thought was one cluster is actually two, or that a topic you expected is missing.
If I were looking for the clothesReturn topic I would search for the keywords 'return' and 'bring back'.
I would look for the obvious words in each of the topics I expect.
<p>
Once I had marked up the questions matching my obvious keywords as clothesReturn, whenever I found a new question in that topic I would look for the word that showed me it belonged there but was not in my original search list:
<p>
<i>Can I exchange a jumper I bought yesterday for a new one</i>
<p>
I would then search for other uses of 'exchange'. It is a word likely to be used in clothesReturn but one I missed earlier.
<p>
If you know the domain, roughly half of the questions will be classified by your obvious keywords.
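The keyword pass can be sketched as a small labelling function. The keyword map and questions below are invented for illustration:

```python
# Keyword-to-topic map for the first pass; topics and keywords are illustrative.
KEYWORDS = {
    "clothesReturn": ["return", "bring back", "exchange"],
    "delivery": ["deliver", "shipping", "arrive"],
}

def label_topic(question):
    """Return the first topic whose keywords appear in the question."""
    q = question.lower()
    for topic, words in KEYWORDS.items():
        if any(word in q for word in words):
            return topic
    return "unknown"  # review these by hand; they may reveal new topics

questions = ["Can I exchange a jumper I bought yesterday?",
             "When will my order arrive?",
             "Do you sell gift cards?"]
print([label_topic(q) for q in questions])
```

In a spreadsheet workflow the same function would fill the new 'Topic' column, leaving 'unknown' rows for manual review.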
<p>
I would read through the remaining questions with my 5 expected topics in my head. If I see something that is obviously a new topic I add that to the topic list.
<p>
Feel free to mark 5-10% of questions as unknown. These might make more sense when you have more questions, or might be part of the long tail: out of scope or off topic questions that your chatbot will not handle.
<p>
<b>What Next</b>
<p>
Once you have a spreadsheet with a column marked up with the Topic of each question the next step is to find the intent of each question. But now you are reviewing a series of questions in one topic which makes it much easier to concentrate and work in a batch mode.
<p>
I will describe this step of marking up intents in a later blogpost.Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-18956698990786933892018-02-05T06:57:00.003-08:002018-02-05T06:57:47.079-08:00An alternative way of training Watson Discovery Service<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiU5vb97ep7el8ddkCJdkNsxgZOU5FNCaRwxTxe_KQ3U3IJZ1r22cxIo6U0yn5AH5okHave_B8gj145yAS8uah9Wub_RThcuI1hJ6CM6hansJ7lvunam1e0ecjqoY3Gqp6i39vmKIxBTPc/s1600/WDS-purple.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="225" data-original-width="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiU5vb97ep7el8ddkCJdkNsxgZOU5FNCaRwxTxe_KQ3U3IJZ1r22cxIo6U0yn5AH5okHave_B8gj145yAS8uah9Wub_RThcuI1hJ6CM6hansJ7lvunam1e0ecjqoY3Gqp6i39vmKIxBTPc/s1600/WDS-purple.png" /></a></div>
Watson Discovery Service (WDS) provides an excellent natural language query service. This service works well out of the box, but many users like to improve the results for their particular domain by training the service. In order to train the service how to better rank the results of natural language query you need to provide the service with some sample queries and for each query indicate which documents are good results for this query and equally importantly which documents would be a bad result for the query.<br />
<br />
The <a href="https://console.bluemix.net/docs/services/discovery/train-tooling.html#improving-result-relevance-with-the-tooling" target="_blank">standard user interface to the training capability</a> allows you to view the potential results in a browser and then click on a button to indicate if the result is good or bad. Clicking on the results is easy for a small sample of queries, but it quickly becomes tedious. For this reason, many users prefer to use <a href="https://console.bluemix.net/docs/services/discovery/train.html#improving-result-relevance-with-the-api" target="_blank">the API for the training service</a> which gives additional control and capabilities.<br />
<br />
Unfortunately the WDS training service only works well with large amounts of training data and in many cases it is not feasible to collect this volume of training data. Luckily there is an alternative (homegrown) way of training WDS which works significantly better for small amounts of training data. The method (which is known as <i>hinting</i>) is amazingly simple. All you need to do is add a new field to your target documents (e.g. named <i>hints</i>) containing the text of the questions for which you want the document to be selected as an answer. Obviously, when you ask this question (or a similar question) the natural language query engine will select your target document and rank it highly, since it is clearly a good match.<br />
<br />
This alternative training method is sometimes called <i>hinting</i> because you are providing hints to WDS about which questions this document answers. An additional benefit of this training method is that it helps find matches where the question and the answer document don't have any words in common. The standard WDS training method, by contrast, only affects the ranking of results, so if the answer document you want to be selected is not even in the list of top 100 answers fetched for the query, the normal training would not help.<br />
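A sketch of the hinting step in Python: the helper below simply attaches the <i>hints</i> field to a document before ingestion. The field name follows the post's example; the ingestion endpoint mentioned in the comment is an assumption based on the Discovery v1 REST API, and the document contents are invented.

```python
import json

def add_hints(document, questions):
    """Return a copy of the document with a 'hints' field listing the
    questions this document should be selected to answer."""
    hinted = dict(document)
    hinted["hints"] = " ".join(questions)
    return hinted

doc = {"title": "Lost baggage policy",
       "text": "Report missing bags at the arrivals desk within 24 hours."}
hinted = add_hints(doc, ["Who do I talk to about my lost bags?"])
print(json.dumps(hinted, indent=2))

# The hinted JSON is then ingested like any other document, e.g. (assumed
# Discovery v1 endpoint):
#   POST /v1/environments/{environment_id}/collections/{collection_id}/documents
```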
<br />Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-68965416525643405472018-01-30T12:24:00.002-08:002018-01-30T12:27:20.963-08:00Visualizing Chatbot Quality with Swarm PlotWhen you create a chatbot you frequently want to see where it is going wrong so that you can fix problems. When you look at the logs or run tests you get results of the form
<p>
<b>Question, Correct Intent, Returned Intent, Confidence</b>
<p>
Can I update my account settings?,Update,Check,0.332
<p>
I have a demo dataset here you can use to follow along with the code.
<a href="https://drive.google.com/file/d/1qZKatu5S1eyqIETAWJbkDml0aUN1tn37/view?usp=sharing">Swarm csv</a>
<p>
Usually the confusion matrix of which intentions are mixed up with each other can be shown with a heatmap. But an interesting visualisation for this type of data is a <a href="https://seaborn.pydata.org/generated/seaborn.swarmplot.html">swarm plot</a> using the Python seaborn library. There is a nice guide to the <a href="https://elitedatascience.com/python-seaborn-tutorial">seaborn visualization library here</a>
<p>
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;"># Pandas for managing datasets
import pandas as pd
# Matplotlib for additional customization
from matplotlib import pyplot as plt
%matplotlib inline
# Seaborn for plotting and styling
import seaborn as sns

# Read in the csv
df = pd.read_csv('swarm.csv', index_col=0, encoding='mac_roman')
df.columns = ['Intent', 'Expected', 'Confidence']

# Draw the swarm chart
plt.figure(figsize=(10, 6))
swarm_plot = sns.swarmplot(y='Confidence',
                           x='Expected',
                           hue='Intent',
                           data=df)
plt.legend(bbox_to_anchor=(1, 1), loc=2, title='Got')
plt.title('Swarm Report')
</code></pre>
The graph shows you which intentions are being mixed up and the confidence that your chatbot has in its answers.
<p>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg89rcfFRIO2fNbXdDbGzEgbMq2VaRO_pHnzbryDJmRLKsQKvt76e5T2RRzv1LUtQIY036Sv6bPudiH6VeGoJlvh9K3YDuThAzUijAyoT5ZQxblxKxYCApyEujPhhUSi7DNBIycthhzioA/s1600/swarm.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg89rcfFRIO2fNbXdDbGzEgbMq2VaRO_pHnzbryDJmRLKsQKvt76e5T2RRzv1LUtQIY036Sv6bPudiH6VeGoJlvh9K3YDuThAzUijAyoT5ZQxblxKxYCApyEujPhhUSi7DNBIycthhzioA/s400/swarm.png" width="500" height="275" data-original-width="686" data-original-height="387" /></a>
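For comparison, the heatmap mentioned above can be drawn from the same three columns. A minimal sketch using a tiny inline sample instead of the CSV (the counts are invented):

```python
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Tiny inline sample with the same columns as swarm.csv (values illustrative).
df = pd.DataFrame({
    "Intent":   ["Check", "Update", "Check", "Update", "Check"],
    "Expected": ["Check", "Update", "Update", "Update", "Check"],
})

# Rows: expected intent, columns: returned intent, cells: counts.
cm = pd.crosstab(df["Expected"], df["Intent"])
sns.heatmap(cm, annot=True, cmap="Blues")
plt.title("Confusion heatmap")
```

The heatmap shows only counts, while the swarm plot above also shows the confidence of each individual answer.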
<p>Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-2867201283899891202018-01-16T02:36:00.000-08:002018-01-16T02:36:57.482-08:00Combining the annotation capabilities of both Watson Knowledge Studio and Watson Discovery ServiceWatson Discovery Service (WDS) provides a capability to automatically annotate the documents being ingested. This capability is available in several languages and it is able to recognize a wide range of entity types commonly found in typical texts written in these languages.<br />
<br />
Unfortunately many users of WDS have to deal with documents which are not typical. For example, they could be dealing with medical documents that contain unusual drug and disease names or they could be dealing with a business domain that has obscure terminology that would not be understood by WDS (or indeed by most speakers of the language in question).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIsGFYRtU-mC3S9YrUuW6_F_cYzPIcL4Zpl-FuOXXtKY_gnSfy37pCA4K18S9CeKB9EHxqPPMpSlCE6OXrymQtKER81618KRWr0U7j-Cyw4fL36fPX3_z9dpbAt_6WKscAGh7bptkYV9U/s1600/WKS_logo.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="292" data-original-width="292" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIsGFYRtU-mC3S9YrUuW6_F_cYzPIcL4Zpl-FuOXXtKY_gnSfy37pCA4K18S9CeKB9EHxqPPMpSlCE6OXrymQtKER81618KRWr0U7j-Cyw4fL36fPX3_z9dpbAt_6WKscAGh7bptkYV9U/s200/WKS_logo.jpg" width="200" /></a></div>
Luckily, Watson Knowledge Studio (WKS) can be used to create a language model that understands the specialized terminology of any domain. However, many document collections will contain a mixture of specialized terminology and normal text. By default, when users specify that a customized WKS domain model is to be used instead of the generic WDS model, it acts as a replacement and none of the normal entities will be annotated by WDS.<br />
<br />
It is not feasible for users to build a complete WKS model that incorporates all of the normal language dictionaries as well as the specialized domain terminology. However, there is a trick which can be used to get WDS to use both the domain specific annotator from WKS and the generic language annotator from WDS.<br />
<br />
Unfortunately this trick is not possible with the normal WDS UI; it requires the use of the <a href="https://www.ibm.com/watson/developercloud/discovery/api/v1/" target="_blank">REST API</a> - hopefully you are already familiar with this and you should be able to <a href="https://www.ibm.com/watson/developercloud/discovery/api/v1/#get_configurations" target="_blank">export your configuration</a> to a JSON file. Assuming that you have configured a number of enrichments for the field named "text", you will see that your configuration contains a fragment that looks something like the following:<br />
<br />
<pre> <span style="color: blue;"> "enrichments": [
{
"enrichment": "natural_language_understanding",
"source_field": "text",
"destination_field": "enriched_text",
"options": {
"features": {
"keywords": {},
"entities": {
"model": "a3398f8b-2282-4fdc-b062-227a162dc0eb"
},
"sentiment": {},
"emotion": {},
"categories": {},
"relations": {},
"concepts": {},
"semantic_roles": {}
}
}
}
],</span></pre>
<br />
This fragment means that you have selected a number of different enrichment types to be computed for the text field and the results to be placed in the field named "enriched_text". For most of these enrichments you will use the language model which is provided with the natural language understanding unit that is built into WDS, but for entities it will instead rely upon the WKS model ID "a3398f8b-2282-4fdc-b062-227a162dc0eb".
<br />
<br />
If you want to have the core WDS detected entities as well as the WKS detected ones, you need to define an additional enrichment entry in your configuration file to place these enrichments in a different named field e.g. wds_enriched_text. The fragment of JSON above needs to be replaced with the fragment below and then the new configuration should be <a href="https://www.ibm.com/watson/developercloud/discovery/api/v1/#replace_configuration">uploaded via the API</a>.<br />
<br />
<pre> <span style="color: blue;"> "enrichments": [
{
"enrichment": "natural_language_understanding",
"source_field": "text",
"destination_field": "enriched_text",
"options": {
"features": {
"keywords": {},
"entities": {
"model": "a3398f8b-2282-4fdc-b062-227a162dc0eb"
},
"sentiment": {},
"emotion": {},
"categories": {},
"relations": {},
"concepts": {},
"semantic_roles": {}
}
}
},
{
"enrichment": "natural_language_understanding",
"source_field": "text",
"destination_field": "wds_enriched_text",
"options": {
"features": {
"entities": {}
}
}
}
],</span></pre>
<br />
What this configuration will produce is two different enrichment fields containing the entities detected by WDS and WKS. However, it is likely that you want to have all of the detected entities available in a single field. Luckily this is possible by configuring the collection to merge the two fields during the "Normalize" phase.
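The configuration edit described above can also be scripted rather than done by hand. A sketch in Python, operating on the exported JSON as a plain dictionary (the field names follow the fragments above; uploading the result via the replace-configuration API call is left out):

```python
import copy

def add_generic_entity_enrichment(config, source_field="text",
                                  destination_field="wds_enriched_text"):
    """Append a second NLU enrichment so WDS's generic entity annotator runs
    alongside the WKS model and writes its entities to a separate field."""
    updated = copy.deepcopy(config)
    updated.setdefault("enrichments", []).append({
        "enrichment": "natural_language_understanding",
        "source_field": source_field,
        "destination_field": destination_field,
        "options": {"features": {"entities": {}}},
    })
    return updated

# A cut-down version of the exported configuration fragment shown above.
config = {"enrichments": [{
    "enrichment": "natural_language_understanding",
    "source_field": "text",
    "destination_field": "enriched_text",
    "options": {"features": {
        "entities": {"model": "a3398f8b-2282-4fdc-b062-227a162dc0eb"}}},
}]}
new_config = add_generic_entity_enrichment(config)
```

The updated dictionary is then serialised back to JSON and uploaded with the replace-configuration API call described above.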
Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com4tag:blogger.com,1999:blog-3777125262143150721.post-59494119503624367862017-09-20T21:52:00.000-07:002017-11-08T06:08:34.301-08:00Adding a speech interface to the Watson Conversation Service<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjBqmcUkkGz4g1Q8BemcLdA5GTbH20eE2Dqd705og3O15WFHzAhYMffU2usGhajn95BSEXPtA6eLpQVGma1VnZ8oOrN7tr4lz3KYcvR4RpYU9rRJpo4q5VwIrUBY883QPF1Knj7Lfo_Pg/s1600/TexttoSpeech.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="200" data-original-width="300" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjBqmcUkkGz4g1Q8BemcLdA5GTbH20eE2Dqd705og3O15WFHzAhYMffU2usGhajn95BSEXPtA6eLpQVGma1VnZ8oOrN7tr4lz3KYcvR4RpYU9rRJpo4q5VwIrUBY883QPF1Knj7Lfo_Pg/s200/TexttoSpeech.png" width="200" /></a></div>
The <a href="https://www.ibm.com/watson/services/conversation" target="_blank">IBM Watson Conversation Service</a> does a great job of providing an interface that closely resembles a conversation with a real human being. However, with the advent of products like the <a href="https://en.wikipedia.org/wiki/Amazon_Echo" target="_blank">Amazon Echo</a>, <a href="https://www.microsoft.com/en-gb/windows/cortana" target="_blank">Microsoft Cortana</a> and the <a href="https://en.wikipedia.org/wiki/Google_Home" target="_blank">Google Home</a>, people increasingly prefer to interact with services by speaking rather than typing. Luckily IBM Watson also has <a href="https://www.ibm.com/watson/services/text-to-speech" target="_blank">Text to Speech</a> and <a href="https://www.ibm.com/watson/services/speech-to-text/" target="_blank">Speech to Text</a> services. In this post we show how to hook these services together to provide a unified speech interface to Watson's capabilities.<br />
<br />
In this blog we will build upon the existing <a href="https://speech-to-speech-app.mybluemix.net/" target="_blank">SpeechToSpeech sample</a> which takes text spoken in one language and then leverages Watson's machine translation service to speak it back to you in another language. You can <a href="https://speak-to-watson-app.au-syd.mybluemix.net/" target="_blank">try the application described here on Bluemix</a> or <a href="https://github.com/bodonova/SpeakToWatson" target="_blank">access the code on GitHub</a> to see how you can customise the code and/or deploy it on your own server.<br />
<br />
This application has only one page and it is quite simple from the user's point of view.<br />
<ul>
<li>At the top there is some header text introducing the sample and telling users how to use it. </li>
<li>The sample uses some browser audio interfaces that are only available in recent browser versions. If we detect that these features are not present we put up a message telling the user that they need to choose a more modern browser. Hopefully you won't ever see this message.</li>
<li>In the original sample there are two drop down selection boxes which allow you to specify the source and target language. We removed these drop downs since they are not relevant to our modified use case.</li>
<li>The next block of the UI gives the user a number of different ways to enter speech samples:</li>
<ul>
<li>There is a button <img border="0" data-original-height="70" data-original-width="316" height="20" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbfehNndlIGvclQo5FAAVk3AgTilvIkjrXQ9af8fU_ZobS9gdvC5JRpIx7c1Mcmy4cUsY6haDpNtmX0-ZTTfqFM8DVk68PAntt1vFjp1_-Vz3PxjA-iCGCWA8pi4lGtRxinHQIwqvZrl4/s200/record.png" width="100" /> which allows you to start capturing audio directly from the microphone. Whatever you say will be buffered and then passed directly to the transcription service. While capturing audio, the button changes colour to red and the icon changes <img border="0" data-original-height="58" data-original-width="302" height="20" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiNJoVPAo4v-agGhiNIfUwdHFp4IxxBVHUEUB6PX-RUTsjzgv66t902f_wVHiKHTLBTTe7WTaqfxmTN_Wi6RbClNNfR4R3il-o7gAeuTR_Dk19sjUaU6ETbbsLL-TT9eL3pbS0hYjYxts8/s200/record-ongoing.png" width="100" /> - this is a visual indication that recording is in progress. When you are finished talking, click the button again to stop audio capture.</li>
<li>If you are working in a noisy environment or if you don't have a good quality microphone, it might be difficult for you to speak clearly to Watson. To help solve this problem we have provided you with some sample files hosted in the web app. To play one of these samples click on one of the <img border="0" data-original-height="64" data-original-width="184" height="20" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsmMejveeqy28R7KO1vQ_ST-ZWbY10YHtFwFzV91h27GcVkS0SYix9LT3Rpls3Dlr1IzxZYsbZT7ELqZjRIx0SpvC7KZ1LAYm0u1Z9pucrhOox3ubJXah_b1w0zb_AgvXbbQdbPiAMytg/s1600/play.png" width="70" /> buttons to play the associated file and use it as input.</li>
<li>If you have your own recording, you can click on the <img border="0" data-original-height="70" data-original-width="266" height="20" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9vzbWlzlUxns94gA94z9OI6wS4SkzjxZDjJb2X9NYiB73ttGp9UdGdApbq1psGfNaQkUw3ibREP2b11xkFnQqWbweUBNkJkq0jNz5XIMBU6mjR0xbeTVT9fFhWe3wZwab_AsJtYUNRds/s1600/select-file.png" width="100" /> button and select the file containing the audio input that you want to send to the speech-to-text service.</li>
<li>Last, but not least, you can drag and drop an audio file onto the page to have it instantly uploaded</li>
</ul>
<li>The transcribed text is displayed on an input box (so you can see if Watson is hearing properly) and sent to either the translation service (in the original version) or the conversation service in our updated service. If there is a problem with the way your voice is being transcribed, see this <a href="https://watson-tricks.blogspot.co.uk/2017/09/watson-speech-to-text-with-nodejs.html" target="_blank">previous article</a> on how to improve it.</li>
<li>When we get a response from the conversation or translation service we place the received text on an output text box and we also call the text-to-speech service to read out the response and save you the bother of having to read.</li>
</ul>
I know that you want to understand what is going on under the covers so here is a brief overview:<br />
<ul>
<li>The app.js file is the core of the web application. It implements the connections between the front end code that runs in the browser and the various Watson services. This involves establishing 3 back-end REST services. This indirection is needed because you don't want to include your service credentials in the code sent to the browser and because your browser's cross site script protections will prohibit you from making a direct call to the Watson service from your browser. The services are</li>
<ul>
<li>/message - this REST service implements the interface to the Watson Conversation service. Every time we have a text utterance transcribed, we do a POST on this URL with a JSON payload like <span style="color: blue;">{"context":{...},"input":{"text":"<transcribed_text>"}}</span>. The first time we call the service we specify an empty context {} and in each subsequent call we supply the context object that the server sent back to us the last time. This allows the server to keep track of the state of the conversation. <br />Most conversation flows are programmed to give a trite greeting in response to the first message. To avoid spending time on this, the client code sends an initial blank message when the page loads to get this out of the way.</li>
<li>/synthesize - this REST service is used to convert the response into audio. All that this service does is convert a GET on http://localhost:3000/synthesize?voice=en-US_MichaelVoice&text=Some%20response into a GET on the URL <a href="https://watson-api-explorer.mybluemix.net/text-to-speech/api/v1/synthesize?accept=audio%2Fwav&voice=en-US_MichaelVoice&text=Some%20response">https://watson-api-explorer.mybluemix.net/text-to-speech/api/v1/synthesize?accept=audio%2Fwav&voice=en-US_MichaelVoice&text=Some%20response</a>, which returns a .wav file with the text "some response" being spoken in US English by the voice "Michael". </li>
<li>/token - the speech to text transcription is an exception to the normal rule that your browser shouldn't connect directly to the Watson service. For performance reasons we chose to use <a href="https://console.bluemix.net/docs/services/speech-to-text/websockets.html#websockets" target="_blank">the websocket interface to the speech to text service</a>. At page load time, the browser will do a GET on this /token REST service and it will respond with a token code that can then be included in the URL used to open the websocket. After this, all sound information captured from the microphone (or read from a sample file) is sent via the websocket directly from the browser to the Watson speech to text service.</li>
</ul>
<li>The index.html file is the UI that the user sees. </li>
<ul>
<li>As well as defining the main UI elements which appear on the page, it also includes main.js which is the client side code that handles all interaction in your browser.</li>
<li>It also includes the <a href="https://jquery.com/" target="_blank">JQuery</a> and <a href="http://getbootstrap.com/" target="_blank">Bootstrap</a> modules. But I won't cover these in detail.</li>
</ul>
<li>You might want to have a closer look at the client side code which is contained in a file public/js/main.js:</li>
<ul>
<li>The first 260 lines of code are concerned with how to capture audio from the client's microphone (if the user allows it - there are tight controls on when/if browser applications are allowed to capture audio). Some of the complexity of this code is due to the different ways that different browsers deal with audio. Hopefully it will become easier in the future. </li>
<li>Regardless of what quality audio your computer is capable of capturing, we downsample it to 16-bit mono at 16 kHz because this is what the speech recognition service expects.</li>
<li>Next we declare which language model we want to use for speech recognition. We have hardcoded this to a model named "en-GB_BroadbandModel" which is a model tuned to work with high fidelity captures of speakers of UK English (sadly there is no language model available for Irish English). However, we have left in a few other language models commented out to make it easy for you if you want to change to another language. Consult the <a href="https://console.bluemix.net/docs/services/speech-to-text/input.html#models" target="_blank">Watson documentation</a> for a full list of language models available.</li>
<li>The handleFileUpload function deals with file uploads. Either file uploads which happen as a result of explicitly clicking on the "Select File" button or upload that happen as a result of a drag-and-drop event.</li>
<li>The initSocket function manages the interface to the websocket that we use to communicate to/from the speech_to_text service. It declares that the showResult function should be called when a response is received. Since it is not always clear when a speaker has finished talking, the speech-to-text service can return several times. As a result, the msg.results[0].final variable is used to determine whether the current transcription is final. If it is an intermediate result, we just update the resultsText field with what we heard so far. If it is the final result, the msg.results[0].alternatives[0].transcript variable is taken as the most likely transcription of what the user said and passed on to the converse function.</li>
<li>The converse function handles sending the detected text to the Watson Conversation Service (WCS) via the /message REST interface which was described above. When the service gives a response to the question, we pass it to the text-to-speech service via the TTS function and we write it in the <i>response</i> textarea so it can be read as well as listened to.</li>
</ul>
<li>In addition there are many other files which control the look and feel of the web page, but they won't be described in detail here, e.g. </li>
<ul>
<li>Style sheets in the /public/css directory</li>
<li>Audio sample files in the /public/audio directory</li>
<li> Images in the public/images directory</li>
<li>etc.</li>
</ul>
</ul>
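As an illustration of the down-sampling step described in the list above, here is a sketch of how such a conversion might look. This is an assumption for illustration (naive decimation with no low-pass filter), not the actual code from main.js:

```javascript
// Naive down-sampler: picks every Nth sample to go from sourceRate to
// targetRate, then scales Float32 samples in [-1, 1] to 16-bit signed
// integers. A real implementation should low-pass filter first to
// avoid aliasing; this sketch skips that for clarity.
function downsampleTo16Bit(float32Samples, sourceRate, targetRate) {
  var ratio = sourceRate / targetRate; // e.g. 44100 / 16000
  var outLength = Math.floor(float32Samples.length / ratio);
  var out = new Int16Array(outLength);
  for (var i = 0; i < outLength; i++) {
    var sample = float32Samples[Math.floor(i * ratio)];
    sample = Math.max(-1, Math.min(1, sample)); // clamp to [-1, 1]
    out[i] = sample < 0 ? sample * 0x8000 : sample * 0x7FFF;
  }
  return out;
}
```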
Anyone with a knowledge of how web applications work should be able to figure out how it all fits together. If you have any trouble, post your question as a comment on this blog.<br />
<div>
At the time of writing, there is an instance of this application running at <a href="https://speak-to-watson-app.au-syd.mybluemix.net/">https://speak-to-watson-app.au-syd.mybluemix.net/</a> so you can see it running even if you are having trouble with your local deployment. However, I can't guarantee that this instance will stay running due to limits on my personal Bluemix account.</div>
Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com4tag:blogger.com,1999:blog-3777125262143150721.post-89834605019859233542017-09-15T07:28:00.000-07:002018-01-08T02:31:43.447-08:00Translating a ChatbotYou have a trained chatbot built in English and your boss wants it working in German next week. Here is what I would do.
<br />
Tell her it is impossible. Building a new chatbot in a different language involves starting the whole process from scratch. Linguistic, cultural and company process reasons mean a translated chatbot won't work.
<br />
Then I would build it anyway.
<br />
Create a language <a href="https://console.bluemix.net/docs/services/language-translator/getting-started.html#gettingstarted">translation service in Bluemix</a>. Note the username and password this service has.
<br />
Get the codes of the languages you want to translate between. As of September 2017 you can translate from English 'en' to Arabic 'ar', Brazilian Portuguese 'pt', French 'fr', German 'de', Italian 'it', Japanese 'ja', Korean 'ko', and Spanish 'es'.
<br />
Take your current ground truth of questions and intents. Put the questions and intents in a spreadsheet, and sort the list by intent so that all the questions about a topic are together.
<br />
The Python code below translates the questions into German. You need to use the username and password you set up earlier.
<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">import csv
import json

from watson_developer_cloud import LanguageTranslatorV2 as LanguageTranslator

# Credentials from the translation service you created earlier
language_translator = LanguageTranslator(
    username="",
    password="")

with open('myfile.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in spamreader:
        print(row[0], end=',')
        translation = language_translator.translate(text=row[0],
                                                    source='en',
                                                    target='de')
        a = json.dumps(translation, indent=2, ensure_ascii=False)
        print(a.strip('\"'))
</code></pre>
Here myfile.csv contains
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">Am I allowed to speak in german,
Do you speak German?,
Do you speak English?,
What languages can I talk to you in?,
</code></pre>
And this gives the output
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">davids-mbp:translate davidcur$ python translate.py
Am I allowed to speak in german,Bin ich konnte in Deutsch zu sprechen
Do you speak German?,Möchten Sie Deutsch sprechen?
Do you speak English?,Wollen Sie Englisch sprechen?
What languages can I talk to you in?,Welche Sprachen kann ich zu Ihnen sprechen?
</code></pre>
You will want to translate certain phrases and words in a non-standard way. If the chatbot talks about a company called 'Exchange', you will want to warn the translator not to translate that into the German word for exchange. To do this you load a glossary file into your translator. <a href="https://gist.github.com/cavedave/5015d2e4bfd7bdf8f09d5457781aff8c">glossary.tmx looks like
this</a>
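For illustration, a minimal entry that pins the company name 'Exchange' so it is never translated might look like the fragment below. This is a trimmed sketch (a real TMX header carries more mandatory attributes); the linked gist is the working version.

```xml
<tmx version="1.4">
  <header segtype="sentence" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <!-- Force the company name to stay the same in German -->
      <tuv xml:lang="en"><seg>Exchange</seg></tuv>
      <tuv xml:lang="de"><seg>Exchange</seg></tuv>
    </tu>
  </body>
</tmx>
```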
<br />
and then run code to tell your translation service to use this glossary. In Node.js this is:
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
var watson = require('watson-developer-cloud');
var fs = require('fs');

var language_translator = watson.language_translator({
  version: 'v2',
  url: "https://gateway.watsonplatform.net/language-translator/api",
  username: "",
  password: ""
});

var params = {
  name: 'custom-english-to-german',
  base_model_id: 'en-de',
  forced_glossary: fs.createReadStream('glossary.tmx')
};

language_translator.createModel(params, function(err, model) {
  if (err)
    console.log('error:', err);
  else
    console.log(JSON.stringify(model, null, 2));
});
</code></pre>
The terms in your glossary are likely to be in your entities, as most of your common proper nouns, industry terms and abbreviations end up there.
<br />
Now get someone who speaks both English and the target language fluently. Stick these translated questions in the spreadsheet and go through each question to humanise the translation. This is a pretty quick process. They will occasionally come across words that have to be added to the glossary.tmx file, after which the translations are rerun.
<br />
At the end of this you have an attempt at a ground truth in another language for a chatbot. There are several reasons why this won't be perfect. Germans do not speak like translated British people: they have different weather, culture and laws. And the differences between Germany and Britain are smaller than those between many countries.
<br />
Their questions are likely to be different. But as a first cut, this gets a chatbot to the point where it might recognise a fair chunk of what people are saying. It can at least be used to help you gather new questions, or to classify the new questions you collect from actual Germans.
<br />
Manufacturing questions never really works well, as <a href="https://sodoherty.ai/2017/09/09/manufacturing-intent/">Simon pointed out here</a>, and translating questions is close to that. But it can be a useful tool to quickly get a chatbot to the point where it can be bootstrapped into a full system.
<br />
<br />
Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com1tag:blogger.com,1999:blog-3777125262143150721.post-9755875921147005802017-09-05T03:58:00.001-07:002017-11-08T06:08:04.239-08:00Watson Speech to Text with Node.jsTalking to a computer makes you feel like you're living in the future. In this post I will show how to use Node.js to expand on the Watson <a href="https://speech-to-text-demo.mybluemix.net/">Speech to Text</a> (STT) example and improve the accuracy of the transcription.
<br />
As a base I will take the <a href="https://conversation-demo.mybluemix.net/">cognitive car demo</a> and try to write an STT system for it. Once speech to text can correctly interpret what a person asks the car, we will want to send those commands to the Watson Conversation Service car demo. But I will deal with connecting STT to WCS in another post.
<br />
Log into Bluemix and set up an STT service. Create new credentials and save them, as you will use them in all the following programs. This username and password are used everywhere below.
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq-Vsrc0n_fNO_oZBhd_sey__0ZCc6zVnj_zJXwEubTTI3JXmyNxZOshGtormug48hDcYFaDk7KUJO2eMtpyRc5IimHdEaZ51UuE2s2OuRnWRxgLYOVtNwFshHWZ3vQfHbaQO3_xE8Hvw/s1600/service.png" imageanchor="1"><img border="0" data-original-height="633" data-original-width="1600" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgq-Vsrc0n_fNO_oZBhd_sey__0ZCc6zVnj_zJXwEubTTI3JXmyNxZOshGtormug48hDcYFaDk7KUJO2eMtpyRc5IimHdEaZ51UuE2s2OuRnWRxgLYOVtNwFshHWZ3vQfHbaQO3_xE8Hvw/s400/service.png" width="400" /></a>
<br />
Next, assuming you already have node installed, you also need the <a href="https://github.com/watson-developer-cloud/node-sdk">watson developer cloud SDK</a>
<br />
npm install watson-developer-cloud --save
<br />
and the <a href="https://github.com/watson-developer-cloud/speech-to-text-utils">speech-to-text-utils</a>
<br />
npm install watson-speech-to-text-utils -g
<br />
<br />
Get an audio file of the sorts of things you would say to this Car demo. I asked questions like the ones below
<br />
<br />
turn on the lights
<br />
where is the nearest restaurant
<br />
wipe the windscreen
<br />
When is the next petrol station
<br />
When is the next gas station
<br />
does the next gas station have rest rooms
<br />
How far are the next rest rooms
<br />
please play pop music
<br />
play rock music
<br />
play country music
<br />
<br />
<br />
I have made an <a href="https://drive.google.com/file/d/0ByfEYKk1FmScNDY5YU9FWjhyWG8/view?usp=sharing">audio file of that</a> (I had a cold). You can get it here or record your own version.
<br />
In speech_to_text.v1.js in the <a href="https://github.com/watson-developer-cloud/node-sdk/tree/master/examples">SDK examples</a>, put in the username and password you created above, and point it at your sound file of commands.
<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
const speech_to_text = new SpeechToTextV1({
  username: 'INSERT YOUR USERNAME FOR THE SERVICE HERE',
  password: 'INSERT YOUR PASSWORD FOR THE SERVICE HERE'
});

fs.createReadStream(__dirname + '/DavidCarDemo.wav').pipe(recognizeStream);
</code></pre>
<br />
Now look at transcription.txt. We want to improve the accuracy of this transcription.
Next, create a model that we will train so that the STT service understands our <a href="https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/?node#custom_methods">speech better</a>.
<br />
<br />
Running node <a href="https://gist.github.com/cavedave/6b0b39216a652cd02d75fb6e82e97ca2">createmodel.js</a> gives you a customization_id which is needed from here on.
<br />
<br />
<blockquote>
{
  "customization_id": "06da5480-915c-11e7-bed0-ef2634fd8461"
}</blockquote>
If we give STT the car demo corpus to take in, it learns from this the sort of words to expect the user to say to the system.
<br />
watson-speech-to-text-utils set-credentials
<br />
and give it the same username and password as above.
<br />
watson-speech-to-text-utils customization-add-corpus
<br />
This will ask you for a conversation workspace. Give it
car_workspace.json, downloaded from the <a href="https://github.com/watson-developer-cloud/car-dashboard">car demo</a>. Adding a conversation workspace as a corpus improves the accuracy of the words that are unusually common in our conversation.
<br />
Now we want to improve accuracy on words that are not common in the corpus. For example, "Does the next" is currently heard as "dostinex".
<br />
watson-speech-to-text-utils corpus-add-words -i 06da5480-915c-11e7-bed0-ef2634fd8461
<br />
and give it words.json, which looks like:
<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
{
  "words": [{
    "display_as": "could",
    "sounds_like": [ "could" ],
    "word": "culd"
  }, {
    "display_as": "closeby",
    "sounds_like": [ "closeby" ],
    "word": "closeby"
  }, {
    "display_as": "does the next",
    "sounds_like": [ "does the next", "dostinex" ],
    "word": "does_the_next"
  }]
}</code></pre>
The speech utils allow you to see what words you have in your model. Words you add do not overwrite old ones, so sometimes you need to delete old values.
watson-speech-to-text-utils customization-list-words -i 06da5480-915c-11e7-bed0-ef2634fd8461
<br />
<a href="https://github.com/watson-developer-cloud/speech-to-text-utils">watson-speech-to-text-utils corpus-delete-word</a> -i 06da5480-915c-11e7-bed0-ef2634fd8461 -w "word to remove"
<br />
<br />
Finally in <a href="https://gist.github.com/cavedave/6b0b39216a652cd02d75fb6e82e97ca2">transcribe.js</a> you need to tell it to use the model we have just made
<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
const params = {
  content_type: 'audio/wav',
  customization_id: '06da5480-915c-11e7-bed0-ef2634fd8461'
};</code></pre>
The code I have used is up on github <a href="https://gist.github.com/cavedave/6b0b39216a652cd02d75fb6e82e97ca2">here</a>.
<br />
Now we get a more accurate transcription of the voice commands: first by training on the Conversation corpus, then by fixing other common errors using a words.json file.
<br />
<br />
Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com1tag:blogger.com,1999:blog-3777125262143150721.post-34778894882802850912017-08-31T05:37:00.003-07:002017-08-31T08:04:59.897-07:00Combining other services with IBM Watson Conversation Service<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkSu1-LXoY9sjn3iHJjVJsWylfb4Hk8Q0rl5ROlETYONOje6Yj55zTyVSjv-UmZwDo0ZlzwSMvvs71Jq2n6vj8bM7vy_8JV3_2nF2byovXfsutl603efb3Qy-oHziRacUX1QbdUREEj3A/s1600/Conversation_pictogram.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="200" data-original-width="300" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkSu1-LXoY9sjn3iHJjVJsWylfb4Hk8Q0rl5ROlETYONOje6Yj55zTyVSjv-UmZwDo0ZlzwSMvvs71Jq2n6vj8bM7vy_8JV3_2nF2byovXfsutl603efb3Qy-oHziRacUX1QbdUREEj3A/s200/Conversation_pictogram.png" width="200" /></a></div>
It is becoming increasingly popular to offer an interface to computer applications which resembles the way that we converse with a fellow human. The <a href="https://www.ibm.com/watson/services/conversation/" target="_blank">IBM Watson Conversation Service</a> is an excellent way to program such an interface because it allows the developer an easy way to specify the conversation flow and is also very good at doing fuzzy matching on input text to guess what the user is really trying to find out. However, the graphical way that conversation flows are specified doesn't allow the user to make calls to external services in order to get information to be included in the reply.<br />
<br />
People often need to call external services to get the information that their users are looking for. In this article I describe a simple sample application, written by myself and my colleague David Curran, which shows a common pattern: the conversation service provides a template response along with parameters, and the calling application uses these to retrieve the information needed to give the end user the answer they are looking for.<br />
<br />
This pattern is useful in a lot of different situations, but we will use a fictitious application where people want to use a conversational interface to track their parcels. We will leverage the <a href="https://github.com/watson-developer-cloud/conversation-simple" target="_blank">simple conversation application</a> as a starting point to minimise the amount of work. You can either download that sample and follow the steps below to add the interface to the conversation agent, or if you prefer you can download the completed example from <a href="https://github.com/bodonova/conversation-parcel" target="_blank">our GitHub repository</a>.<br />
<h3>
Adding the Parcel Intent to the conversation</h3>
In order to modify the conversation agent to handle parcel requests, you first need to add a parcel intent to the list of intents. The original sample contains 25 intents, which is the maximum allowed with the free plan, so you will need to delete one of the existing intents. I deleted the weather intent since it is not being used, and then added a parcel intent with a few sample inputs, as you can see below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0c-BH0dYCqmm5qPPhkdHuGPC8jv789PUM7EMo9qH6-RBirxjbkko8xQYXYMuvryn1uxHqug-M0l09JDagp8UiVhZhF8WwtVF2ayRILZW_WI70m56f0rXgQ-Fx1UsBgdrk8IIxqgjiHMo/s1600/Parcel-Intent.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="984" data-original-width="1046" height="301" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0c-BH0dYCqmm5qPPhkdHuGPC8jv789PUM7EMo9qH6-RBirxjbkko8xQYXYMuvryn1uxHqug-M0l09JDagp8UiVhZhF8WwtVF2ayRILZW_WI70m56f0rXgQ-Fx1UsBgdrk8IIxqgjiHMo/s320/Parcel-Intent.png" width="320" /></a></div>
<br />
The next step is to add a node to the dialog to specify how parcel queries are to be dealt with. Our logic is quite simple. If a number is detected in the input, we assume that this is the parcel number, so we set a context variable <i>parcel_num</i> with this value and then send back a response message with placeholders where the parcel location should be inserted. However, if no number is detected in the input, we simply reply saying that they need to supply us with a parcel number. For simplicity's sake we won't consider holding context from one question to the next.<br />
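For illustration, the JSON behind such a dialog node might look roughly like the fragment below. The response wording and the use of @sys-number are assumptions for illustration; the essential part is the {0} placeholder that the application will later replace with the parcel's location:

```json
{
  "context": {
    "parcel_num": "@sys-number"
  },
  "output": {
    "text": {
      "values": [
        "Parcel number @sys-number is currently at {0}"
      ],
      "selection_policy": "sequential"
    }
  }
}
```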
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwZBcb7_itcC5UNlVP_FD28BOf9byXiuD_D9sQLoxFcY8szxGxyyq6kNsqKPQrPTeV44g4Rx79VJZpnFiJ_nx5Z5kic4bqY_d3JxY_G6Zq2kIaCTGUPAJl9MxeGDPTqkKBLn5HbHrqJYw/s1600/Parcel-Dialog.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1091" data-original-width="1600" height="435" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwZBcb7_itcC5UNlVP_FD28BOf9byXiuD_D9sQLoxFcY8szxGxyyq6kNsqKPQrPTeV44g4Rx79VJZpnFiJ_nx5Z5kic4bqY_d3JxY_G6Zq2kIaCTGUPAJl9MxeGDPTqkKBLn5HbHrqJYw/s640/Parcel-Dialog.png" width="640" /></a></div>
<br />
<div>
<br /></div>
<h3>
Implementing the dummy parcel lookup service</h3>
<div>
We don't want to use a real parcel lookup service for this sample, because when testing we won't know the parcel number for parcels in transit. Instead we will implement a very simple lookup service.<br />
<br />
To implement the parcel lookup service you need to add the following function near the end of app.js.<br />
What this does is respond to GET requests on<span style="color: blue;"> <i>/api/parcel</i></span> with one of the sample location names, e.g. requesting <i><a href="http://localhost:3000/api/parcel?parcel_num=6">http://localhost:3000/api/parcel?parcel_num=6</a></i> will return the string "Buckingham Palace". To illustrate how we should deal with errors, we have implemented the rule that if the parcel number is divisible by 13, it will return status code 404 and an error message saying that the parcel number is unlucky.<br />
<br /></div>
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">/**
 * A dummy parcel tracking service
 */
app.get('/api/parcel', function(req, res) {
  var parcel_num = parseInt(req.query.parcel_num);
  if (!req.query.parcel_num || isNaN(parcel_num)) {
    return res.status(400).end("Not a valid parcel number "
      + req.query.parcel_num);
  }
  if (0 == (parcel_num % 13)) {
    return res.status(404).end("We can't find parcel number "
      + parcel_num + " it is unlucky!");
  }
  var locations = [
    'Anfield', 'Stamford Bridge', 'Old Trafford', 'Parkhead',
    'Heathrow Airport', 'Westminister, London', 'Buckingham Palace',
    'Lands End, Cornwall', 'John O\'Groats'
  ];
  parcel_num = parcel_num % locations.length;
  var location = locations[parcel_num];
  res.end(location);
});
</code></pre>
<br />
You should experiment with this service and/or customise it before moving on to the next steps.<br />
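To see the mapping in isolation, the routing logic of the dummy service reduces to a pure function like this (extracted here for illustration; the real service wraps the same logic in an Express handler):

```javascript
// Mirrors the dummy service's routing: non-numbers are rejected,
// numbers divisible by 13 are "unlucky" (the service answers 404),
// and everything else maps into a fixed list of UK locations by
// taking the parcel number modulo the list length.
var locations = [
  'Anfield', 'Stamford Bridge', 'Old Trafford', 'Parkhead',
  'Heathrow Airport', 'Westminister, London', 'Buckingham Palace',
  'Lands End, Cornwall', 'John O\'Groats'
];

function lookupParcel(parcelNum) {
  if (isNaN(parcelNum)) return { status: 400 };
  if (parcelNum % 13 === 0) return { status: 404 };
  return { status: 200, location: locations[parcelNum % locations.length] };
}
```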
<h3>
Recognising a parcel location request and filling in the details</h3>
<div>
The main code modification we need to make is in the <i><span style="color: blue;">app.post('/api/message',</span></i> function in app.js. However, we first need to make some housekeeping changes because we will be using the <a href="https://github.com/ranm8/requestify" target="_blank">requestify</a> library.</div>
<div>
<br />
Add the following line to the dependencies section of package.json:
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;"> "requestify": "^0.2.5",
</code></pre>
<br />
Then add this line near the top of app.js
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">var requestify = require('requestify');
</code></pre>
</div>
<br />
The nub of the code is contained in the function below. You should paste this into app.js to replace the call to conversation.message which is around line 56 of the original file.<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">// Send the input to the conversation service
conversation.message(payload, function(err, data) {
  if (err) {
    // the conversation service returned an error
    return res.status(err.code || 500).json(err);
  }
  var parcel_num = data.context.parcel_num;
  if (data.intents && (data.intents.length > 0) && data.intents[0].intent
      && (data.intents[0].intent === 'parcel') && parcel_num) {
    var server = 'localhost';
    var port = process.env.PORT || process.env.VCAP_APP_PORT || 3000;
    var url = 'http://' + server + ':' + port + '/api/parcel?parcel_num=' + parcel_num;
    requestify.get(url)
      .then(function(response) {
        var location = response.body;
        data.output.text[0] = data.output.text[0].replace(/\{0\}/g, location);
        return res.json(data);
      })
      .catch(function(err) {
        data.output.text[0] = "Parcel lookup service returned an error: " + err.body;
        return res.json(data);
      });
  } else {
    return res.json(data);
  }
});
</code></pre>
<br />
The original code did nothing other than call the <i><span style="color: blue;">updateMessage</span></i> function before passing the data received from the Conversation service back to the UI layer. However, the <i><span style="color: blue;">updateMessage</span></i> function didn't do anything useful, so we can delete it; instead we will call our dummy parcel location service to find the location of the parcel whose number appears in the context variable.<br />
<br />
If the http call succeeds we assume that we have a good location and we replace any placeholder <i><span style="color: blue;">{0}</span></i> strings in the message received from the conversation service with this location. If we get an error status from the http call, we replace the entire string received from the conversation service with a message saying we failed to locate the parcel.<br />
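The substitution itself can be pulled out as a small helper (the function name here is mine, for illustration). Note the /g flag on the regex: passing a plain string to replace would only substitute the first occurrence of the placeholder.

```javascript
// Replace every {0} placeholder in a response template with the value
// returned by the lookup service.
function fillTemplate(template, value) {
  return template.replace(/\{0\}/g, value);
}
```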
<h3>
Summary</h3>
You now have a conversation service which can reply to questions such as "where is my parcel number 543210" with details of where the parcel is located. It is currently only using a toy implementation which picks locations in the UK based on the parcel number. However, it should be relatively easy to point it at any real parcel tracking service you want. In fact, the methods used can easily be applied to interfacing with any 3rd party service.<br />
<br />
You can go to <a href="https://github.com/bodonova/conversation-parcel" target="_blank">https://github.com/bodonova/conversation-parcel </a>to download the complete working sample.Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0tag:blogger.com,1999:blog-3777125262143150721.post-76735241734814105162017-08-31T05:10:00.000-07:002017-08-31T05:33:07.128-07:00Sending an Email from Watson Conversation ServiceSending emails is a simple way to connect a chatbot to a business process. This post shows how to extend the Watson Conversation Sample Application to get it to send email from a Gmail account. We will use the <a href="https://nodemailer.com/about/">nodemailer</a> library and a slightly modified version of the code from <a href="https://www.w3schools.com/nodejs/nodejs_email.asp">w3 schools</a> to help us.<br />
<br />
The first step is to download the original sample application <a href="https://github.com/watson-developer-cloud/conversation-simple">from GitHub</a>. Follow the instructions in the Readme file in the repository to get this application running on your local machine and/or on the BlueMix service. Make sure you have the original application working correctly before you go on to make any changes.<br />
<br />
Modify the code below to reference the username and password of a Gmail account you have access to (or <a href="https://accounts.google.com/SignUp?service=mail">create a new account</a>) and then add it near the top of the app.js file.<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">var nodemailer = require('nodemailer');

var transporter = nodemailer.createTransport({
  service: 'Gmail',
  auth: {
    user: 'chatbot@gmail.com',
    pass: 'secretpassword'
  }
});

var mailOptions = {
  from: 'chatbot@gmail.com',
  to: 'name@email.com',
  subject: 'Sending Email from the chatbot',
  text: 'That was easy!'
};
</code></pre>
<br />
If you use two-factor authentication in Gmail, you have to get <a href="https://support.google.com/accounts/answer/185833?hl=en">an app password</a> for your Gmail address. Google is worried you will share your Gmail password with people, so they give you a special password just for your app that has limited powers.<br />
<br />
Next update the Watson application's package.json to say that you want to install the <a href="https://nodemailer.com/" target="_blank">nodemailer</a> package and force the server to use a version of Node.js that supports <a href="https://nodemailer.com/" target="_blank">nodemailer</a>, i.e.:<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">  "dependencies": {
    ...,
    "nodemailer": ">0.3.x"
  },
  ...
  "engines": {
    "node": ">= 6.9.x",
    "npm": "> 5.30.x"
  }
</code></pre>
<br />
<br />
Having done this, the next step is to change the conversation flow in the Watson Conversation Service so that when the user says ‘send an email’, we will set a context variable called 'email' to say an email should be sent. (A production version would probably send different emails to different addresses depending upon context, but we will stick to a simple example for now.)<br />
<br />
Your WCS needs an intent that recognises when the user wants to send an email. This contains example user utterances like:
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
#sendEmail
Can you email for me?
email dave important info
Ping dave an email to tell him to do important business things
Send an email to find my package
Send an email reminder message
Message Dave a mail to tell him the stuff he needs to do
</code></pre>
so that when Watson recognises similar utterances it knows it should send an email.
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN0oAhA_U2TVpiTx-4m34_HNcRAadzPodSy_Zj_7Xo87u_i9W4iqrkCNI6raNHZDL-zQIas6owOsWnYx711-DdAUa9f_U6WN5DX0s_DwrilsbaDngObdfS6xDCVgoGoWSKnFygFWLv_v8/s1600/intent.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN0oAhA_U2TVpiTx-4m34_HNcRAadzPodSy_Zj_7Xo87u_i9W4iqrkCNI6raNHZDL-zQIas6owOsWnYx711-DdAUa9f_U6WN5DX0s_DwrilsbaDngObdfS6xDCVgoGoWSKnFygFWLv_v8/s320/intent.png" width="320" height="251" data-original-width="1425" data-original-height="1116" /></a>
<br />
In the WCS dialog tab we want a node that recognises this intent and sets a context variable to signal to the Node application that it should send an email.
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW04PQwj3ekWFfCVPXaiPty3a6h0QE1NsOsd3Uw6Gih2TGDi0cyzfqVKKMAvAC9npbWfdRd3JWtx8_QWej69n1MIVKtP7vpzdfzdFiCNOw_8DRpHdhAyR_F6V9R6_0YNbGqyW4UZeZ2YI/s1600/dialogedit.png" imageanchor="1" ><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW04PQwj3ekWFfCVPXaiPty3a6h0QE1NsOsd3Uw6Gih2TGDi0cyzfqVKKMAvAC9npbWfdRd3JWtx8_QWej69n1MIVKtP7vpzdfzdFiCNOw_8DRpHdhAyR_F6V9R6_0YNbGqyW4UZeZ2YI/s400/dialogedit.png" width="600" height="291" data-original-width="1600" data-original-height="762" /></a>
<br />
In the JSON editor of the dialog node, add a context variable for the email.
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: "arial"; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">
{
  "context": {
    "email": "TRUE"
  },
  "output": {
    "text": {
      "values": [
        "Ill email david now"
      ],
      "selection_policy": "sequential"
    }
  }
}
</code></pre>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXlg9pUDEWlgwDaX3VxAizcxZ-0VpabelnyOTO_X1zqE1atfmhZjA21cKtqMEHwCy0XQwPI7nYxEoxmtBXIWo1JgpxG9rcjjvizj5Yvu6XlwUmueLco1a1j2T5e40aeO3H3FC5DGudke0/s1600/sendEmail.png" imageanchor="1"><img border="0" data-original-height="602" data-original-width="830" height="350" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXlg9pUDEWlgwDaX3VxAizcxZ-0VpabelnyOTO_X1zqE1atfmhZjA21cKtqMEHwCy0XQwPI7nYxEoxmtBXIWo1JgpxG9rcjjvizj5Yvu6XlwUmueLco1a1j2T5e40aeO3H3FC5DGudke0/s320/sendEmail.png" width="500" /></a>
<br />
<br />
Next, you need to change app.js so that an email is sent whenever this context variable is set. At the start of the function <i>updateMessage</i>, add the following code: it checks whether the <i>email</i> context variable in the response from WCS has been set, and sends an email if it has.<br />
<br />
<pre style="background: #f0f0f0; border: 1px dashed #cccccc; color: black; font-family: 'arial'; font-size: 12px; height: auto; line-height: 20px; overflow: auto; padding: 0px; text-align: left; width: 99%;"><code style="color: black; word-wrap: normal;">if (response.context.email) {
  response.context.email = null; // clear the flag so we don't send an email on every turn
  transporter.sendMail(mailOptions, function(error, info) {
    if (error) {
      console.log(error);
    } else {
      console.log('Email sent: ' + info.response);
    }
  });
}</code></pre>
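The snippet above assumes that <i>transporter</i> and <i>mailOptions</i> have already been defined near the top of app.js. They are not shown in the original sample, but a minimal configuration sketch using the nodemailer package might look like the following — the service, credentials, and recipient address are all placeholders, not values from the original app:

```javascript
// Sketch of the transporter/mailOptions setup assumed by the snippet above.
// Requires the nodemailer package: npm install nodemailer
var nodemailer = require('nodemailer');

// Transport configured for a Gmail account; credentials are placeholders
// and should come from environment variables rather than being hard-coded.
var transporter = nodemailer.createTransport({
  service: 'gmail',
  auth: {
    user: process.env.MAIL_USER,
    pass: process.env.MAIL_PASS
  }
});

// Static mail content; with this setup the chatbot sends the same
// message every time the email context variable is set.
var mailOptions = {
  from: process.env.MAIL_USER,
  to: 'dave@example.com',
  subject: 'Message from the chatbot',
  text: 'Hi Dave, the chatbot asked me to email you.'
};
```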
<br />
<br />
You can download the complete sample from <a href="https://github.com/cavedave/conversation-simple">here</a>.<br />
<br />
This code just sends the same email every time. In practice, you would probably want to include some information from the conversation in the mail, usually taken from context variables. For example, you could include details the user had told the chatbot, such as the value in <i>response.context.query</i>.
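As a sketch of that idea, a small helper function (hypothetical, not part of the sample app) could build the mail options from whatever the dialog stored in context:

```javascript
// Hypothetical helper: build nodemailer-style mail options from the
// Watson context, falling back to a default when no query was captured.
function buildMailOptions(context) {
  var query = context.query || 'no details provided';
  return {
    from: 'chatbot@example.com',  // placeholder sender
    to: 'dave@example.com',       // placeholder recipient
    subject: 'Request from the chatbot',
    text: 'Hi Dave, a user asked: ' + query
  };
}
```

You would then pass the result of this function to <i>transporter.sendMail()</i> instead of a fixed <i>mailOptions</i> object.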
<br />
<br />
I am not a Node.js developer, and this code needs error handling and security checks added to it. But for a quick demo it should be enough to show your chatbot sending email.
<br />
<br />Iamreddavehttp://www.blogger.com/profile/02768287658329807075noreply@blogger.com4tag:blogger.com,1999:blog-3777125262143150721.post-44532706649792182602017-08-31T03:00:00.000-07:002017-08-31T05:16:43.812-07:00Welcome to this BlogIBM Watson services are amazing. Most people who saw <a href="http://www.techrepublic.com/article/ibm-watson-the-inside-story-of-how-the-jeopardy-winning-supercomputer-was-born-and-what-it-wants-to-do-next/" target="_blank">Watson winning the Jeopardy game on television</a> thought that they would love to have that advanced technology applied to their problems. Unfortunately, advanced technology can sometimes be complex, and people struggle to see how they can adapt the IBM Watson services to their needs. As a result, we decided to establish this blog, which will focus on tips and tricks to help you get the most out of the IBM Watson services.<br />
<br />
This is not an official blog, so there is no guarantee that the tips will continue to work. Treat the advice with caution and use the tips at your own risk.Brian O'Donovan (BOD)http://www.blogger.com/profile/10652457466416412942noreply@blogger.com0