Add components chapter

fhdsl · Dec 13, 2023 · b652185
1 parent d01fdae
commit b652185
Showing 1 changed file with 199 additions and 21 deletions.
diff --git a/03b-Determining-AI-Needs-components.Rmd b/03b-Determining-AI-Needs-components.Rmd
@@ -1,35 +1,213 @@
 
-# (PART\*) Determining AI Needs {-}
 
-```{r, include = FALSE}
-ottrpal::set_knitr_image_path()
+# What are the components of AI?
+
+## Learning objectives:
+
+- Understand what makes a good AI model
+- Describe what makes a model accurate
+- Understand fundamentals about what makes AI models computationally efficient
+- Describe components of LLMs and other AI models and how training data is critical to their accuracy
+
+## Intro
+
+What makes today’s AI chatbots perform so much better than earlier ones, like the assistants that resembled office supplies and helped us write documents? In this chapter we’ll discuss some generalities of how AI works and what makes an AI tool good.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g1965a5f7f0a_0_44")
+```
+
+A good AI model is accurate – you need it to give answers that are correct or at least useful. A good model is also computationally efficient: it needs to return an answer in a reasonable amount of time, and we don’t want to spend tons of money on the computation it takes for the chatbot to work.
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g4b351d4d791f2af2_0")
+```
+
+## What makes an AI model accurate?
+
+Let’s talk about the basics of what makes an AI model accurate. To understand this, we need to discuss some principles behind machine learning.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_22")
+```
+
+Picture you were teaching someone (like an AI model) to tell apples from bananas. The training data you might give them would be a series of apples and bananas, each labeled as either an apple or a banana.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_26")
+```
+
+You could then test the model’s ability to identify apples and bananas based on this training by giving it a fruit to identify. Assuming the fruit you gave it is reasonably identifiable from its training, it should accurately identify an apple.
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_250")
+```
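+
+One common way to put a number on this kind of test is accuracy: the fraction of test items the model labels correctly. As a made-up illustration (the counts below are hypothetical, not results from any real model), suppose we show the model 10 test fruits and it labels 9 of them correctly:
+
+$$
+\text{accuracy} = \frac{\text{number of correct predictions}}{\text{number of test items}} = \frac{9}{10} = 90\%
+$$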
+
+However, if the test you give the model is outside the kind of data it was trained on, it might not do well. For example, if you didn’t provide any green apples during training and then test the model with a green apple, it may or may not succeed.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_42")
+```
+
+To address this gap in the model's knowledge, you might add supplementary training data and retrain it so that it understands that apples could also be green.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_273")
+```
+
+
+This added training data may help the model identify green apples. But if it is given something similar to an apple that is not one, say a pear, it may incorrectly identify the pear as an apple if it hasn't ALSO been trained on pears.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_216")
+```
+
+
+This may feel silly to you -- why couldn't it identify a pear -- but this is because you are a really well-trained AI. (Actually just the I; you presumably aren't artificial.) You've seen lots of fruit in your life -- you've collected a lot of training data on this task and have no problem telling a pear from an apple.
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g4b351d4d791f2af2_4")
+```
+
+But we could throw you off too.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_238")
+```
+
+When you look at this image of a hybrid apple-banana, if AI models could feel, this is how they would feel.
+
+
+## What makes an AI model efficient?
+
+Let’s talk about the basics of what makes an AI model efficient. To understand this, we need to discuss some more principles behind machine learning.
+
+Let's return to apples.
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_112")
+```
+
+With the above image, you don't need much time to look at that picture and know that that is an apple. You don't have to think about this for very long.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_117")
+```
+
+Notice also that you didn't take in one piece of information at a time. You recognized the apple all at once.
+
+This type of information processing is what neural networks are based on. Neural networks are computational models loosely inspired by how brains process information.
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_125")
+```
+
+Think about how you'd read the following paragraph:
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_129")
+```
+
+**Did you read each word, in order from start to end?**
+
+OR
+
+**Did you pick out keywords by skimming and getting the gist? Maybe later going back to pick up context you missed?**
+
+Older AI models read text sequentially – from start to finish. And as you may sense, that is a slower way to read.
+
+Alternatively, newer algorithms often use attention mechanisms. These algorithms work in a way that is analogous to skimming the input text.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_305")
+```
+
+However, you can probably also sense that just because attention mechanisms are faster doesn’t mean they are more accurate for every use – by skimming you can sometimes miss important information.
+
+Regardless of that, let’s walk through more about this analogy to get a sense of how attention mechanisms can work.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_157")
+```
+
+First we might highlight keywords in this paragraph. Meanwhile, the words and phrases being processed are chunked into units called “tokens”. With the attention mechanisms we referred to, we would focus on the most important tokens first.
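+
+One way to picture "focusing on the most important tokens" is that an attention mechanism gives each token a weight, and the weights add up to 1. A common recipe is the softmax function, which turns relevance scores $s_i$ into weights $\alpha_i$. The three scores below are invented for illustration and are not taken from any real model:
+
+$$
+\alpha_i = \frac{e^{s_i}}{\sum_j e^{s_j}}, \qquad (s_1, s_2, s_3) = (2, 1, 0) \;\Rightarrow\; (\alpha_1, \alpha_2, \alpha_3) \approx (0.67, 0.24, 0.09)
+$$
+
+Here the token with the highest score receives about two thirds of the attention, while the lowest-scoring token is mostly, but not entirely, ignored.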
+
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_146")
 ```
 
-# Introduction to Determining AI Needs
+When we connect the relationships between these words, we might already start pulling out some of the meaning of this paragraph. Grabbing these relational words will help us piece together more meaning.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_168")
+```
+
+Lastly we might pull out some contextual information from the other words we left behind.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_199")
+```
+
+Let's hear it straight from an AI model. We asked Bard to tell us what phrases it would pull out as keywords with attention mechanisms if we gave it this paragraph.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_189")
+```
+
+Without these recent advancements in attention mechanism algorithms, the large language models that we see today would not be possible. It's these computationally efficient mechanisms, in addition to physical hardware improvements in computing, that have made large language models possible.
+
+## Putting it together
+
+In summary, a good AI model is accurate -- this is largely determined by its training data being high quality, relevant, and properly processed.
+
+A good AI model is also computationally efficient. We need to use algorithms that can efficiently and properly process data.
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a3ef1dce0a_0_310")
+```
+
+
+In order to visualize this, we've made a fake machine learning machine to describe AI. AI models can take a lot of different forms and functions and this visual is merely a tool to understand generalities about components of AI. It is not meant to be a detailed representation of any given AI model.
+
+But we can discuss AI tools in terms of their:  
+
+- *input* - what is the user of the tool providing?
+- *processing (including algorithms)* - what are we going to do with that input?
+- *training data* - how was the model trained? What information was it trained on?
+- *output* - what are we returning to the user of this AI tool?
+
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a61bc12ef8_0_0")
+```
+
+Each of these components can get very complicated very quickly. Although we won't go through the details of these in this course, we will discuss practical aspects of these in terms of customization for AI needs.
 
-## Motivation
 
-There are a ever increasing number of options, strategies and solutions for integrating AI solutions into a project. It can feel overwhelming to understand what these options entail let alone understand how to decide what solution best fits a use case.
+Large language models are one popular type of AI tool. So we can talk about the components of these models in the context of this visual.
 
-In this course we aim to give individuals the basic info they need to make basic plans for integrating AI tools into their project.
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a678430d60_0_153")
+```
 
-## Target Audience  
 
-The course is intended for individuals who have an AI related project in mind or think that they might need to incorporate AI into their project. They are likely the leader who is guiding others in an AI related project and not necessarily the person who will write code or carry out the technical aspects of the project. T
+Tokens are units of a language (these might be words or phrases). Transformers organize tokens to find the meaning and context. To do this processing, tokens are coded as embeddings, which are numerical representations of tokens. Encoders process the input text from a user, while decoders generate the output text that is sent back to the user.
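+
+To make "numerical representations" a little more concrete, here is a toy example with invented two-number embeddings (real models use hundreds or thousands of numbers per token, and these particular values are made up). One common way to compare two embeddings $u$ and $v$ is cosine similarity, where values near 1 mean the two point in a similar direction:
+
+$$
+\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
+$$
+
+With $e_{\text{apple}} = (0.9, 0.1)$, $e_{\text{pear}} = (0.8, 0.3)$, and $e_{\text{car}} = (0.1, 0.9)$, we get $\cos(e_{\text{apple}}, e_{\text{pear}}) \approx 0.97$ and $\cos(e_{\text{apple}}, e_{\text{car}}) \approx 0.22$, so in this toy setup "apple" sits much closer to "pear" than to "car".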
 
-## Curriculum  
+In summary:
 
-**What this course covers:**
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a678430d60_0_0")
+```
 
-- What are the practical aspects of AI that need to be understood before endeavoring on an AI project?
-- What makes an AI model good?
-- How do you determine what kinds of custom AI solutions your project needs if any?
-- What aspects of your resources and your project should you consider when evaluating AI strategies?
-- What would better suite your needs an "out of the box" AI product or building an AI model solution "from scratch"?
-- Examples of currently existing AI solutions that may suit an individual's AI needs.
+One more important point about AI models: their training and training data are critical. You have likely seen and heard about many biased things that large language models have said. This is because the language they were trained on – the language of human beings in our society – was also very biased.
 
-**What this course does NOT cover:**
+```{r, out.width = "100%", echo = FALSE}
+ottrpal::include_slide("https://docs.google.com/presentation/d/1COHDxEwy9GwXAgUJLBqjDjWm-lqdKy2n3Qds4ivM4UA/edit#slide=id.g2a61bc12ef8_0_169")
+```
 
-- This is NOT a comprehensive survey of the AI tools and products in existence. Even if it was comprehensive at this time, there are new tools and developments constantly arriving. We merely give examples of solutions that show a *possible* AI solution. There may be competitors or similar solutions out there that would even better fit a project's needs.
-- This does NOT cover in depth aspects of algorithms, statistics or mathmetatics behind AI algorithms -- these are numerous and not always necessary to understand in fine detail for making decisions about projects.
-- This does NOT cover how to complete or write code for an AI project. This is not a tutorial for building an AI tool. Instead we merely give stategies you could employ but we do not give details on how you might employ them. There are too many ways that AI tools may be built -- this is outside the scope of this course. 
+To summarize, AI models can only be as good as their training. Garbage training data in means biased garbage out.