Navigating the Vocabulary of Generative AI Series (2 of 3)

This is the second post in my series ‘Navigating the Vocabulary of Gen AI’. It follows on from the first post, where I provided an overview of the following AI terminology:

  • Artificial Intelligence
  • Machine Learning
  • Artificial Neural Networks (ANN)
  • Deep Learning
  • Generative AI (GAI)
  • Foundation Models
  • Large Language Models
  • Natural Language Processing (NLP)
  • Transformer Model
  • Generative Pretrained Transformer (GPT)

Responsible AI

Responsible AI sets out the principles and practices to follow when working with artificial intelligence, so that it is adopted, implemented and executed fairly, lawfully and ethically, giving the business and its customers trust and transparency. How AI is used, and how it may affect humanity, must be governed and controlled by rules and frameworks. Trust, assurance and confidence should be embedded in any models and applications that are built upon AI.

Labelled Data

Labelled data is used to help machine learning models and algorithms process and learn from raw data. The data is ‘labelled’ because it carries tags that describe the target data and give useful information about it. For example, a photo of a tiger could be labelled ‘Tiger’. This gives context to the raw data, which the ML model can then use to learn to recognise other images of tigers. The raw input data can be text, images, video and more, and it typically requires human intervention to label it correctly.
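To make this concrete, here is a minimal sketch of what pairing raw data with labels might look like. The file names and labels are invented for illustration, and a real labelling workflow would involve far more data and tooling.

```python
# Hypothetical raw data: file names invented for this illustration.
raw_data = ["photo_001.jpg", "photo_002.jpg", "photo_003.jpg"]

# Labels that a human annotator might attach to each raw item.
human_labels = {
    "photo_001.jpg": "tiger",
    "photo_002.jpg": "lion",
    "photo_003.jpg": "tiger",
}

# Labelled data is simply each raw item paired with its label.
labelled_data = [(item, human_labels[item]) for item in raw_data]

# Count how many examples exist for each label.
label_counts = {}
for _, label in labelled_data:
    label_counts[label] = label_counts.get(label, 0) + 1
```

Each pair in `labelled_data` is one training example that a model could learn from.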

Supervised learning

Supervised learning is a training method used within machine learning which uses large labelled datasets to learn to predict output variables. Over time, the algorithm learns a mapping function that defines the relationship between the labelled input data and the predicted output. As it learns, the algorithm is corrected whenever it maps an input to the wrong output, which is why the learning process is considered ‘supervised’. For example, if it saw a photo of a lion and classified it as a tiger, the error would be fed back so that the algorithm could adjust and retrain.
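The correction loop can be sketched with a tiny perceptron, one of the simplest supervised learners. This is only an illustration under my own assumptions: the two features (has_stripes, has_mane) and the dataset are made up, and real image classifiers are far more complex.

```python
# Hypothetical toy dataset: each animal is described by two made-up
# features, (has_stripes, has_mane), and labelled 1 for tiger, 0 for lion.
training_data = [
    ((1.0, 0.0), 1),
    ((1.0, 0.0), 1),
    ((0.0, 1.0), 0),
    ((0.0, 1.0), 0),
]

def predict(weights, bias, features):
    """Return 1 (tiger) if the weighted score is positive, else 0 (lion)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train(data, epochs=10, learning_rate=0.1):
    """Perceptron training: adjust the weights only when a prediction is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)  # 0 when correct
            # The 'supervision': the known label corrects a wrong prediction.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train(training_data)
```

After training, `predict(weights, bias, (1.0, 0.0))` classifies a striped, maneless animal as a tiger, because every earlier mistake nudged the weights towards the labelled answer.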

Unsupervised learning

Unsupervised learning differs from supervised learning in that it does not use labelled data. Instead, the model is given full autonomy to identify the characteristics of the unlabelled data and the differences, structure and relationships between data points. For example, if the unlabelled data contained images of tigers, elephants and giraffes, the machine learning model would need to establish specific features and attributes of each picture, such as colour, patterns, facial features, size and shape, and use them to group similar images together.
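As a rough sketch of grouping without labels, the snippet below runs a very small k-means clustering over a single made-up feature (animal height in metres). The heights and the starting guesses for the cluster centres are my own assumptions; k-means is just one of many unsupervised techniques.

```python
def k_means(points, centroids, iterations=10):
    """Minimal one-dimensional k-means: assign points, then move centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            # Put each point in the cluster whose centre is closest.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(point - centroids[i]))
            clusters[nearest].append(point)
        # Move each centre to the mean of its points (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical, unlabelled heights in metres for a mix of animals.
heights = [0.9, 1.0, 1.1, 2.9, 3.0, 3.2, 4.8, 5.0, 5.3]

# Arbitrary starting guesses for three cluster centres.
centroids, clusters = k_means(heights, centroids=[0.0, 2.5, 6.0])
```

No labels were supplied, yet the heights fall into three groups. The algorithm does not know which group is ‘tiger’ or ‘giraffe’; it only finds that the groups exist.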

Semi-supervised learning

Semi-supervised learning combines supervised and unsupervised learning techniques, and so uses both labelled and unlabelled data. Typically you have a small set of labelled data and a much larger set of unlabelled data, which saves you from having to tag a huge amount of data. The small labelled set is used to guide the training of the model, which then helps to classify the data points in the larger unlabelled set.
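One simple way to picture this is pseudo-labelling: learn from the few labelled examples, use that to label the rest, then learn again from everything. The sketch below assumes a single invented feature (height in metres) and a nearest-mean rule, which is my own simplification rather than a standard recipe.

```python
# Hypothetical data: a few labelled heights and many unlabelled ones.
labelled = [(1.0, "tiger"), (3.0, "elephant")]
unlabelled = [0.8, 1.2, 1.1, 2.7, 3.3, 3.1]

def class_means(examples):
    """Average feature value for each label."""
    grouped = {}
    for value, label in examples:
        grouped.setdefault(label, []).append(value)
    return {label: sum(values) / len(values)
            for label, values in grouped.items()}

def pseudo_label(value, means):
    """Pick the label whose mean is closest to the value."""
    return min(means, key=lambda label: abs(value - means[label]))

# Step 1: learn from the small labelled set.
means = class_means(labelled)

# Step 2: use that knowledge to label the larger unlabelled set.
pseudo_labelled = [(value, pseudo_label(value, means)) for value in unlabelled]

# Step 3: refine the model using both sets together.
refined_means = class_means(labelled + pseudo_labelled)
```

Only two examples were tagged by hand, but the refined means are based on all eight data points.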

Prompt Engineering

Prompt engineering is the practice of refining the input prompts you give to large language models so that they generate the most appropriate outputs. By optimising prompts, you can enhance the performance of your generative AI models on specific tasks. Making adjustments to an input prompt changes the output and behaviour of the AI’s response, making it more relevant. Prompt engineering is transforming how humans interact with AI.
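As an illustration, the helper below builds a prompt from optional pieces such as a role, examples and an output format. The function name `build_prompt` and its parameters are hypothetical, and no model is called here; the point is simply to show how a basic prompt can be refined.

```python
def build_prompt(task, role=None, output_format=None, examples=()):
    """Assemble a prompt from a task plus optional refinements."""
    parts = []
    if role:
        parts.append(f"You are {role}.")
    parts.append(task)
    for example_input, example_output in examples:
        parts.append(f"Example input: {example_input}\n"
                     f"Example output: {example_output}")
    if output_format:
        parts.append(f"Respond using this format: {output_format}")
    return "\n\n".join(parts)

# A basic prompt with no refinement.
basic_prompt = build_prompt("Describe a tiger.")

# The same task, refined with a role and an output format.
refined_prompt = build_prompt(
    "Describe a tiger.",
    role="a zoologist writing for primary school children",
    output_format="three bullet points, each under 20 words",
)
```

Sending `refined_prompt` rather than `basic_prompt` to a model would typically steer the tone, audience and structure of the response.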

Prompt Chaining

Prompt chaining is a technique used when working with large language models and NLP which allows conversational interactions to build on previous inputs and responses. A succession of prompts creates contextual awareness, giving a human-like exchange of language and interaction. As a result, it is often implemented successfully in chatbots. It enhances the user’s experience by letting the model respond to bite-sized blocks of data (multiple prompts) instead of a single comprehensive prompt, which could be difficult to respond to.
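A minimal sketch of this idea is to keep a running history and send it along with every new prompt. The `fake_model` function below is a stand-in I have invented so the example runs without a real LLM; it only reports how much context it received.

```python
def fake_model(prompt):
    """Stand-in for a real LLM call (hypothetical, for illustration only)."""
    return f"(reply based on {len(prompt.splitlines())} lines of context)"

def chat(history, user_message, model=fake_model):
    """Send the whole conversation so far, plus the new message, to the model."""
    history = history + [f"User: {user_message}"]
    reply = model("\n".join(history))
    return history + [f"Assistant: {reply}"], reply

history = []
history, first_reply = chat(history, "What is a tiger?")
# 'it' only makes sense because the earlier exchange is sent again.
history, second_reply = chat(history, "Where does it live?")
```

The second prompt carries the first question and answer with it, which is what gives the exchange its contextual awareness.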

Retrieval augmented generation (RAG)

RAG is a framework used within AI that enables you to supply additional factual data to a foundation model from an external source, helping it to generate responses using up-to-date information. A foundation model is only as good as the data it has been trained on, so if its responses are inaccurate or out of date, you can supplement the model with external data, giving it the most recent, reliable and accurate data to work with. For example, if you asked ‘What’s the latest stock information for Amazon?’, RAG would take that question and retrieve the information from external sources before generating the response. This up-to-date information would not be stored within the foundation model itself.
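The retrieve-then-generate flow can be sketched as follows. This is a deliberately simple illustration: the documents, the company name ‘ExampleCorp’ and its share price are all made up, and retrieval here is plain keyword overlap, whereas real RAG systems usually search with embeddings.

```python
# Hypothetical external knowledge source; all content is invented.
documents = [
    "ExampleCorp share price closed at 101.50 on Tuesday.",
    "Tigers are the largest living cat species.",
]

def tokenize(text):
    """Lower-case words with surrounding punctuation removed."""
    return {word.strip(".,?!:'").lower() for word in text.split()}

def retrieve(question, docs, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(docs,
                    key=lambda doc: len(question_words & tokenize(doc)),
                    reverse=True)
    return ranked[:top_k]

def build_augmented_prompt(question, docs):
    """Place the retrieved text in front of the question for the model."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "What is the latest share price for ExampleCorp?"
prompt = build_augmented_prompt(question, documents)
```

The resulting `prompt` is what would be sent to the foundation model, so the model can answer from the retrieved text rather than from its training data alone.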

Parameters

Parameters are the variables within a machine learning model that the algorithm adjusts during training to optimise its performance and generalise from the patterns in the data. These values dictate the model’s behaviour, and they are tuned to minimise the difference between predicted and actual outcomes.
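To show parameters being adjusted, here is a small sketch that fits a straight line, y = w * x + b, using gradient descent. The two parameters are `w` and `b`; the data, learning rate and step count are values I have chosen for illustration. Large models work on the same principle, only with billions of parameters.

```python
# Toy data that follows y = 2x + 1 exactly.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def mean_squared_error(w, b, data):
    """Average squared difference between predicted and actual outcomes."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(data, w=0.0, b=0.0, learning_rate=0.05, steps=2000):
    """Gradient descent: repeatedly nudge w and b to reduce the error."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train(data)
```

Starting from zero, training moves `w` towards 2 and `b` towards 1, which are the values that minimise the difference between predictions and the actual data.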

Fine Tuning

Fine-tuning is the technique of further training a pre-trained model on a particular task or data set to improve its performance. Initially trained on a broad data set, the model is then fine-tuned using a smaller, more task-specific data set. This allows the model to adapt its parameters to better suit the nuances of the new data, improving its accuracy and effectiveness for the targeted application.
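A rough sketch of the idea: rather than training from scratch, start from parameters that were already learned and continue training briefly on a small task-specific data set. The ‘pre-trained’ values, the task data and the training settings below are all assumptions for illustration, using a simple line y = w * x + b as the model.

```python
def mean_squared_error(w, b, data):
    """Average squared difference between predicted and actual outcomes."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, b, data, learning_rate=0.01, steps=200):
    """Continue gradient descent from existing parameters on new data."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical parameters, assumed to come from training on a broad data set.
pretrained_w, pretrained_b = 2.0, 1.0

# A small task-specific data set with a slightly different pattern.
task_data = [(1.0, 3.5), (2.0, 6.0), (3.0, 8.5)]

tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, task_data)

error_before = mean_squared_error(pretrained_w, pretrained_b, task_data)
error_after = mean_squared_error(tuned_w, tuned_b, task_data)
```

Comparing `error_before` with `error_after` shows the pre-trained parameters adapting to the new data with only a short, gentle round of extra training.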

In my next post I will continue to focus on AI, covering the following topics:

  • Bias
  • Hallucinations
  • Temperature
  • Anthropomorphism
  • Completion
  • Tokens
  • Emergence in AI
  • Embeddings
  • Text Classification
  • Context Window

Stuart Scott

Stuart has been working within the IT industry for two decades covering a huge range of topic areas and technologies, from data center and network infrastructure design, to cloud architecture and implementation. To date, Stuart has created 100+ courses relating to Cloud reaching over 120,000 students, mostly within the AWS category and with a heavy focus on security and compliance. Stuart is a member of the AWS Community Builders Program for his contributions towards AWS. He is AWS certified and accredited in addition to being a published author covering topics across the AWS landscape. In January 2016 Stuart was awarded ‘Expert of the Year Award 2015’ from Experts Exchange for his knowledge share within cloud services to the community. Stuart enjoys writing about cloud technologies and you will find many of his articles within our blog pages.
