Thursday, 12 June 2025

What James Joyce Can Teach Us About Chatbot Quality

Here are some simple metrics you can use to assess how good a chatbot's output is. Or, since it's nearly Bloomsday, to tell which era of James Joyce you are reading.

Before using a new chatbot, it’s helpful to get a sense of its output quality and where it may fall short. Some of this can be quantitative, numbers you can use to make comparisons, and some will always be qualitative, based on your own close reading.

Or for the more literary: before reading a piece by Joyce, it might be handy to know whether you're getting the clarity of The Dead, the experimentation of Ulysses, or whatever it is that's going on in Finnegans Wake.


The Metrics

The Five Simple Metrics We'll Use

Word count

LLMs can be verbose. If a new model's outputs are much longer than an earlier model's, it may be inefficient.

Average sentence length

As well as the raw number of words, really long sentences are a common issue. A sudden increase in sentence length is likely a red flag.

Stopword ratio

The glue words of the language ('the', 'a', 'at', 'and', ...) are essential, but if they are overused it can be a sign that a model's output contains a lot of filler.

Flesch Reading Ease

A score based on sentence length and syllables per word. Higher means easier to read. 

Repetition ratio

Compares the number of unique words to the total number of words. A low ratio means a high proportion of unique words, which can make text complicated to read, while a very high ratio suggests repetitive, robotic output.

Most of these metrics raise suspicion about a text when they are unusually high or low. They are a guide to where to look for issues rather than a guarantee of finding them.

Being simple, these metrics will be wrong more often than more nuanced ones, but I will use them to show some of the shortcomings that typically occur in LLM output and how they can be found.
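If you want to compute these numbers yourself, the short Python sketch below uses only the standard library. Note the assumptions: the stopword list is a small illustrative sample, syllables are counted with a rough vowel-group heuristic, and the repetition ratio is taken here as repeated words divided by total words, so its figures may differ slightly from the tables that follow.

import re

# Small illustrative stopword sample - a real run would use a fuller list.
STOPWORDS = {"the", "a", "an", "and", "at", "of", "to", "in", "is", "it",
             "on", "for", "with", "as", "was", "that", "he", "she", "his", "her"}

def count_syllables(word):
    # Rough heuristic: count groups of vowels in the word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_metrics(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    unique_words = len(set(w.lower() for w in words))
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences),
        "stopword_ratio": sum(w.lower() in STOPWORDS for w in words) / len(words),
        # Flesch Reading Ease: 206.835 - 1.015*(words per sentence) - 84.6*(syllables per word)
        "flesch_reading_ease": 206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)),
        # Assumed definition: share of word tokens that repeat an earlier word.
        "repetition_ratio": (len(words) - unique_words) / len(words),
    }

print(text_metrics("The snow was general all over Ireland. "
                   "It was falling on every part of the dark central plain."))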

Example Joycean Usage




Let's take three works by James Joyce and see how they look to these metrics. Before we start, let's say that text from 'The Dead' is good, that one of Ulysses' more lively passages is not as good, and that Finnegans Wake is really not the sort of thing you want a model outputting. Of course, this might differ from your own views and requirements.

The Dead


Metric                 Value
Word Count             2196
Character Count        10206
Avg Sentence Length    23.87
Stopword Ratio         0.44
Flesch Reading Ease    88.36
Repetition Ratio       0.25

Clear, direct prose. Short sentences. Good word variety. This is our baseline.

Oxen of the Sun (Ulysses)


Metric                 Value
Word Count             1904
Character Count        10165
Avg Sentence Length    29.29
Stopword Ratio         0.47
Flesch Reading Ease    61.50
Repetition Ratio       0.18

Sentences are longer, stopwords increase, and readability drops. The style is denser and less repetitive.

Finnegans Wake

Metric                 Value
Word Count             1659
Character Count        10160
Avg Sentence Length    18.64
Stopword Ratio         0.37
Flesch Reading Ease    70.73
Repetition Ratio       0.09

Finnegans Wake has shorter sentences and a relatively readable Flesch score, but extremely low repetition, suggesting high vocabulary churn. This shows some of the limitations of a metric based on syllable counts.






Conclusion

If you had a model that originally produced Dubliners-style output and suddenly started writing like Ulysses or Finnegans Wake, these metrics would flag the change immediately.

They won’t catch every issue, but they’ll give you a fast and useful first pass.

Try it yourself at https://optamam.com/metrics. Paste in some chatbot output or a literary excerpt and see what you find.



By David Curran



Wednesday, 4 June 2025

Google Gemini


Gemini is Google's competitor to ChatGPT from OpenAI. It is independently developed, but it shares many architectural similarities - it is a generative AI that uses a Large Language Model (LLM).

Google claims that Gemini is a state-of-the-art standalone AI capability, but its real strength is the way it has been tightly integrated into other Google offerings.

For example:
  • If you search on Google, the top of the search results will now feature a concise AI-generated article on the topic you queried.
  • When you receive an email via GMail, Gemini will offer to draft a reply (this is a dangerous thing to trust).
  • Gemini will offer to summarise long emails and/or documents written in Google Docs.
  • etc. 

Gemini has a freemium model, which means that the technology is available free for light usage, but a subscription fee of about $20 per month must be paid for heavy usage. The usage limits of the free tier are outlined here. Developers need to buy tokens which allow a certain level of activity - the details are described here.

Tuesday, 7 March 2023

Deleting all documents from a Watson Discovery collection


Sometimes when you are cleaning up a collection, you might want to delete all of the documents. It is tedious to do this by clicking on the files in the UI. Therefore I wrote a simple script del_docs.ipynb to do this for you.

To run this script, edit the panel containing the collection details and then just run the notebook.

By default the script will delete all documents. If you only want to delete a subset of documents, alter the query variable to specify the selection formula for the documents you want to delete.
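If you prefer to see the idea in plain code rather than a notebook, here is a rough sketch of the same approach using the ibm-watson Python SDK (Discovery v2). This is not the del_docs.ipynb script itself: the API key, service URL, project and collection IDs are placeholders, and the batch size and empty query are just examples.

from ibm_watson import DiscoveryV2
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and IDs - replace with your own values.
discovery = DiscoveryV2(version="2020-08-30",
                        authenticator=IAMAuthenticator("YOUR_API_KEY"))
discovery.set_service_url("YOUR_SERVICE_URL")
PROJECT_ID = "YOUR_PROJECT_ID"
COLLECTION_ID = "YOUR_COLLECTION_ID"

query = ""  # leave empty to match everything, or narrow it to delete a subset

# Omit the query parameter entirely when it is empty, so every document matches.
query_args = {"query": query} if query else {}
response = discovery.query(project_id=PROJECT_ID,
                           collection_ids=[COLLECTION_ID],
                           count=1000,
                           **query_args).get_result()

# Delete each matching document. Re-run (or add paging) if the collection
# holds more documents than a single query returns.
for doc in response.get("results", []):
    discovery.delete_document(project_id=PROJECT_ID,
                              collection_id=COLLECTION_ID,
                              document_id=doc["document_id"])
    print("deleted", doc["document_id"])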

Monday, 4 July 2022

Automatically emailing a copy of conversation history

Many chatbots offer the option of emailing the user a copy of the conversation history. Traditionally this has been done by writing code server side, but with the latest version of the web widget associated with Watson Assistant it is really easy to implement this with a small bit of client side code.

The code to implement this can be downloaded from GitHub. What you need to do is:

  1. Place the code contained in the history_template.html file somewhere in the header part of your web site's page. 
  2. Modify the INTEGRATION_ID, REGION_ID and INSTANCE_ID variables to match your Watson Assistant instance.
  3. Create an account on elasticemail.com if you don't already have one. Then insert the SMTP_HOST, SMTP_USERNAME, SMTP_PASSWORD and FROM_ADDRESS values in the JavaScript code.
The code is fairly easy to understand, but here are a few pointers:
  • The start and end of the code block is the same as the web widget which is provided for you by the Watson Assistant UI. In fact accessing this widget is the easiest way to get the correct values for the integrationID:, region:, and serviceInstanceID: variables.
  • The code registers a number of event handlers:
    • The histHandler() function gets called when the chat window is opened. The event passed contains the history which has been stored client side. This is used to construct the initial email contents and store it in the global variable historyText.
    • The msgSent() and msgReceived() functions are called when a message is sent by the end user and by Watson respectively. They update the global variable historyText.
    • Last, but not least, an event handler gets called whenever the chat window is closed/minimised. This calls the sendEmail() function described below.
  • The sendEmail() function implements the actual sending of the email. The documentation for this JavaScript library suggests that any arbitrary SMTP server can be used, but I found that it wouldn't work with any SMTP agent apart from smtp.elasticemail.com. The value of the From: variable indicates where the email will appear to be coming from - this will have to be the email address you used to create the account on Elastic Email. Be careful what address you use - this is where any replies will be sent.
If you want to customise the look and feel of the chat history emails, edit the variable baseHistoryText and/or change the watsonMsg() and userMsg() functions.

I hope you find this sample useful. As always, this code is provided on an as-is basis; no warranty or support is offered.

Thursday, 21 May 2020

Video Tutorials on Chatbot Making

I have started making videos explaining the chatbot-making process. They are not Watson-specific; they cover the steps needed in general.





Friday, 8 May 2020

Understanding Intents and Entities

One of the most important things that any conversational agent needs to do is to figure out what intent(s) and entities are contained in the user's input. An intent is what the user wants to do, and the entities are the things they want to do it with or to. This can be confusing, so this post tries to explain with a simple example.

To illustrate, we will consider a chatbot written for a second-hand electronics store. Samples of expected user input might be something like:

  1. I want to sell my old iPhone
  2. I need a new laptop
  3. I am interested in buying a new iPad
In examples 2 and 3 the user's intent is to buy, while in example 1 the intent is to sell. When designing a conversational agent you define the intents by giving examples of what you expect users with each intent to say. In a real system you would need to give many more examples to give Watson a better chance of guessing the user's intent, but this is enough to illustrate the way it works.

There are different ways to specify entities. The most common one is to give samples of the entity values. For example, if we had an entity @device and the possible values were 'phone', 'tablet' and 'laptop', we would specify the entity as shown below by giving examples of what the user might say for each value.
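The screenshot from the original post is not reproduced here, but as a rough stand-in, the @device entity with some hypothetical synonyms for each value might look like this (the synonym lists are my own illustrative guesses, and in Watson Assistant they would be entered through the UI rather than written as code):

# Hypothetical sketch of the @device entity values and synonyms.
device_entity = {
    "phone":  ["phone", "mobile", "iPhone", "smartphone"],
    "tablet": ["tablet", "iPad"],
    "laptop": ["laptop", "notebook", "MacBook"],
}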


The second way of specifying an entity is with a regular expression. This is useful in cases where you want to catch an email address or phone number. In these cases it is not feasible to exhaustively list all possible inputs, but the rules for what an email address or phone number looks like are easy to specify in a regular expression.
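As a rough illustration (this is not the exact pattern Watson uses), an email address entity could be captured with a regular expression along these lines:

import re

# Simplified email pattern for illustration only - real email validation
# is more involved than this.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

print(EMAIL_PATTERN.findall("You can reach me at jane.doe@example.com tomorrow."))
# -> ['jane.doe@example.com']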

The third option for entities is to use one of the predefined system entities. For example, you might use the @sys-date entity as a simple way to capture mentions of a date without having to go to the trouble of specifying a complex regular expression. This also has advantages: when the user types 'tomorrow', the entity extracted is the correct date for the day after they typed it.

Many developers of chatbots don't realise that you can combine entities with intent examples to make them more powerful. If you specify your examples like this:


  1. I want to sell my old @device
  2. I need a new @device
  3. I am interested in buying a new @device
This saves you the trouble of repeating the same sentence for each type of device that the users might want to buy or sell.

Wednesday, 11 September 2019

Convert Watson csv's to RASA yml

Watson allows you to export the ground truth (questions and intents), but RASA expects the data in a slightly different format.
Watson is in the form
Question, Intent
Can I buy a sandwich?, #buy_sandwich


RASA in the form
## intent:buy_sandwich
- Can I buy a sandwich?

RASA calls this its Markdown training data format and stores it in a .md file. The code for this conversion from the Watson format to the RASA format is below.

import pandas as pd

# Read the Watson export: question in the first column, intent in the second.
filePath = "watson_export.csv"  # placeholder: path to the Watson CSV export
questions = pd.read_csv(filePath,
                        names=["Question", "Intent"],
                        skipinitialspace=True)

with open("rasaoutput.md", "w", encoding="utf-8") as file:
    for intent in questions["Intent"].unique():
        # print the intent name in the RASA way, dropping Watson's leading '#'
        file.write("## intent:" + intent.strip().lstrip("#") + "\n")
        # then print every question in the dataset with that label
        for question in questions[questions["Intent"] == intent]["Question"]:
            file.write("- " + question.strip() + "\n")

-David