Teaching AI in K-12
In this presentation from the 2023 VEX Robotics Educators Conference, Dr. David Touretzky from Carnegie Mellon University discusses artificial intelligence (AI) and how robotics education could be changed by the technologies emerging around artificial intelligence. Dr. Touretzky discusses deep neural networks, foundation models for AI, and how educators can approach teaching AI in their classroom.
(warm music)
I've obviously looked forward to every presentation that we've had during the conference. It's pretty much been unanimous that everyone has joined every presentation that we've had during the conference. But, you know, if I had to say, this is probably one of the presentations I've been most looking forward to, and that is not just because I've been playing nonstop on ChatGPT for the last three weeks. But I think obviously as educators, artificial intelligence is something that we have a lot of questions about. AI is something that we might either be worried about, anticipating, can't wait for, all of the above.
But I'm very proud to say that in my humble opinion we have here at our conference this afternoon Professor David Touretzky, who is the foremost expert on artificial intelligence, specifically as it pertains to K-12 education. So it's my pleasure to introduce you to Professor David Touretzky from Carnegie Mellon University.
(audience applauding)
Thank you, Jason. It's a pleasure to be here to talk to you today. Maybe the mic gain can be turned down just a little bit. So how many people here have not played with ChatGPT? So more than I thought, but the rest of you have, right? Okay, so we're living in an amazing time. ChatGPT, these large language models, they're going to change everything and people are aware of that. They're excited by it. They don't yet know exactly how things are changing, but we know that they're changing. It's kind of like standing next to Alexander Graham Bell when he got the telephone to work, or next to Marconi when he made the first successful wireless transmission. We are living in a historic time, and that's leading to AI anxiety. There was a piece in the Wall Street Journal last week about AI grief. But there's also tremendous excitement, and so I want to focus on the positive today and talk to you about teaching AI in K-12.
And I want to address five questions. What is artificial intelligence? What are deep neural networks? What are foundation models? How will robotics education be changed by these technologies? And what can I do now to prepare? So let's just jump into it here. Your students may ask you, "What is artificial intelligence?" And here's an answer that you can give them. Artificial intelligence is, first of all, it's a branch of computer science. It is concerned with techniques that allow computers to do things, okay? So psychology is concerned with how people think. Artificial intelligence is concerned with how to get computers to do things, specifically things that when people do them are considered evidence of intelligence. So that's the definition of AI that we're going to work with.
We're already surrounded by AI technologies. You use AI every day, speech recognition when you talk to Alexa, automated subtitles on videos, that's AI doing speech recognition behind the scenes. Computer vision. You may not be brave enough to sit in the back of your Tesla while it's driving, but many of your cars have automated lane departure warnings, for example. That's computer vision. You unlock your phone with your face. That's computer vision. Language understanding, if you ask Google, "What's the second largest city in Honduras?" and it gives you the answer, there's artificial intelligence behind that. Google Translate, you can translate between any of 112 languages. My favorite Chinese restaurant had an insert in my takeout that was written in Chinese. And I can't read Chinese, so I just pointed my phone at it and it translated it to English for me. That's AI. All these recommender systems when Amazon's trying to sell you more stuff, Netflix is trying to come up with movies for you to watch, all that's based on their model of what they think you're interested in. That's AI.
And of course, in robots, many people have Roombas cleaning their houses. Amazon's got these Kiva robots in their warehouses. All of these things are powered by AI. So we're already living in an AI world, but it's about to get a lot more wild.
AI does pose challenges for society. Many types of work are going to become automated. Some types of jobs will be eliminated. More common will be that just the way people do those jobs will change. A lot of people will be able to be more effective, more productive because AI will be assisting them.
Other places where things can go wrong include people using AI systems to make decisions about people, such as whose resume gets reviewed, who gets a mortgage, and who gets admitted to prestigious schools. If AI systems are making these decisions that affect people's lives, there are questions about whether they are biased unfairly, if they are transparent, and if somebody is taking accountability for what these systems do. These are ethical questions that have to be addressed by our society.
AI is powering the surveillance state. China's a leader in this area, with face recognition cameras on all the street corners. But the US is also in danger of this kind of surveillance state mentality, and we have to decide if we're going to live with that or not. Then there are things like deep fakes that are undermining our confidence in audio and video recordings as being truthful. So it's not all good; AI is both good and bad, but these are issues that are here today, right now.
I'm the founder and chair of ai4k12.org. We were funded by NSF, and our mission was to establish national guidelines for teaching AI in K-12. So how do you teach AI in K-12? There are computer science standards that CSTA, the Computer Science Teachers Association, published. The most recent version was in 2017, and the entire document contained just two sentences about AI. They just didn't address it. So we came along and said, "Let's remedy this."
Our work was co-sponsored by CSTA and by AAAI, the Association for the Advancement of Artificial Intelligence. We set to work trying to figure out what K-12 students should know about AI and what they should be able to do with it. One of the first things we did was release a list of five big ideas in AI, as you can see on the slide here, and a little infographic. This is all now available on a poster, which has been translated into 16 languages because there's so much interest in teaching AI at the K-12 level.
I want to take you through the five big ideas very quickly.
Big idea one: perception. Computers perceive the world using sensors. We make a distinction between sensation and perception. Sensation is the raw signal, while perception is extracting meaning from the sensory signal. Here you see how a self-driving car sees the world. The camera does the sensing, recording the image, but the perception—seeing pedestrians, vehicles, traffic lights, and street signs—is where the AI comes in. That's the perception part. Is it possible to have sensing without perception? Absolutely. That's why you get deer in the supermarket, because you've got this automated door with a sensor, but there's no perception. If there was perception, you wouldn't let the deer in. But without perception, the deer can come in.
Big idea two is representation and reasoning. Agents maintain representations of the world and use them for reasoning. If you teach computing, this is analogous to having data structures and algorithms. The representations are the data structures, and reasoning is the algorithm. The illustration here is from the famous Go match between Lee Sedol, the Go Grand Master, and Google's DeepMind. DeepMind beat the best Go player on the planet a few years ago. Obviously, DeepMind had some representation of the board, and the reasoning algorithm was how you choose the next move.
Big idea three is learning: computers can learn from data. This is what's really powered the AI revolution that's taken place over the last 10 to 15 years. Machine learning suddenly became very powerful, and when applied to other AI problems, it made a lot of other parts of AI work much better than before. For example, speech recognition has been a focus since the 1960s, but it didn't work very well until machine learning techniques became powerful enough to make it effective. The same is true for computer vision, language understanding, and many other areas. Some people conflate machine learning with AI, but they're not the same thing. AI started in the 1950s, while machine learning really began in the 1980s as a sub-discipline of AI. Then, in the 2010s, deep neural networks emerged as a particular approach to machine learning, making things like speech recognition work. The fact that you can talk to Siri now is because of this progress in the field powered by machine learning.
Big idea four is natural interaction. Intelligent agents require many kinds of knowledge to interact naturally with humans. I'll be honest, this is kind of the big carpet under which we swept a lot of stuff because we wanted to have five big ideas. This includes natural language understanding, common sense reasoning, affective computing (reasoning about people's emotional states), and consciousness and issues of philosophy of mind. Can a computer be conscious? How would you know? All of these we've grouped together under natural interaction.
Big idea five is societal impact, the idea that AI can impact society in both positive and negative ways. We're already seeing that today. One aspect of that is the ethics of making decisions about people. You can make an AI system and trivially train it to make decisions about people, but will they be good decisions? Will they be fair? Will there be unintended consequences? That's a much harder thing to get right. It's not even easy to know if you've gotten it right because there are multiple definitions of fairness. Even if you want to be fair, it may not be clear how you define fair. This is one aspect of making decisions about people.
Another aspect is the economic impacts of AI. Increased productivity is a good thing, and new types of services become possible, but there is also a reduction in some types of jobs, hopefully balanced by new career opportunities. One of my students told me yesterday that people who can claim they can do prompt engineering have salaries in the $300,000 range right now. So there are new opportunities coming. If you're training your students to think about careers, they should be thinking about, "How am I going to use AI in my career?"
The third aspect of societal impact is how AI is affecting our culture. Consider things like deep fakes and Snapchat filters. I love this picture here: a self-driving car with a girl on her phone, a toddler in the back staring at the ceiling, and the only one watching the road is the dog, which can't drive. Would you put your child in a self-driving car? At what age? At what age do you think it's okay to put your kid in a self-driving car? Obviously, when they're 18, you can't control it anymore, right? But is 12 old enough? What about 10, six? In my mind, I'm thinking, "What would I want to put in a self-driving car that would make parents comfortable to put their kids in there?" You could imagine monitoring systems, safety systems. Maybe the car won't unlock unless there's an adult at the other end who the car recognizes, and it'll release the child to that person.
There are new kinds of services you can imagine because of these AI technologies, and figuring out what our culture's going to look like, that's an interesting thing that we all get to participate in. Right now, we all get to watch YouTube videos of people sleeping in their Teslas at 70 miles an hour. And, you know, that's also kind of an interesting part of our culture.
Okay, so we want to teach students about AI and help them understand these big ideas. How are we going to do that? We published a set of grade band progression charts. There are five of these progression charts, one for each of the five big ideas. And we cover four grade bands, just like the CSTA computing standards. So the grade bands are K-2, 3-5, 6-8, and 9-12. These are published on the ai4k12.org website.
So this little inset here is one page of the big idea one grade band progression chart. I'm not going to go through all the detail, I just want to give you a little taste of what this is like. So we have the big idea, its name is perception. We have the one sentence statement of the big idea, "Computers perceive the world using sensors." And then we have these little crucial insights. These are the orange boxes. So the first one you've already seen, "Perception is the extraction of meaning from sensory information using knowledge." It's sort of this crucial understanding about perception. And then the second one here may be less familiar to you. "The transformation from signal to meaning takes place in stages, with increasingly abstract features and higher level knowledge applied at each stage."
Now, how do you explain that to somebody in K-12? Well, every column of the progression chart is one of these grade bands, and every row is a major concept or sub-concept. So if we just zoom in, we're going to look at just one row. So this is row 1-B-iv, the abstraction pipeline for vision. And so in grades three to five, we have something just talking about occlusion of objects. So if you look at these shapes here, if I ask you, "What is this red thing?" right, you might tell me that it's a circle, but it's not, right? It's not a perfect circle. There's a green chunk missing because it's being occluded by the green triangle. So getting kids to think about the perception of objects and what occlusion means 'cause computers have to deal with occlusion when they're looking at real world scenes.
So that's grades 3-5. In grades 6-8, we can talk to them about edge detectors and the progression from image to meaning taking place in stages. When they get up to 9-12, we can do more. This is an actual neural network face detection network. So you have a color image over here, you run this through the neural network, and out pops a little box saying, "I see a face." And here's where the face is in the image. And these things over here, these are actually edge detectors and blob detectors. And this is an online demo. You can run this and you can actually play with the neural network. You can look at pieces of the neural network. We have experiments that you can do. And so this is how we teach computer vision at the 9-12 level, by having people do hands-on experimentation.
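For readers who want something concrete, here is a rough sketch of the "stages" idea in plain Python. It is not the face detection demo from the talk, which uses a trained neural network. The function names, the tiny 3-by-4 image, and the threshold of 5 are all made up for illustration. Stage one measures brightness jumps between neighboring pixels, and stage two turns those raw numbers into a slightly more abstract statement about where an edge is.

```python
def vertical_edge_strength(image):
    """Stage 1: for each pixel, measure the brightness jump to its right-hand neighbor."""
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in image
    ]


def edge_columns(strength, threshold=5):
    """Stage 2: report the columns where every row shows a strong jump."""
    width = len(strength[0])
    return [x for x in range(width) if all(row[x] > threshold for row in strength)]


# A hypothetical grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

strength = vertical_edge_strength(image)  # raw signal: per-pixel differences
columns = edge_columns(strength)          # more abstract: "there is an edge here"
print(strength)
print(columns)
```

The raw pixel values correspond to sensation, and the list of edge columns is a first small step toward perception. A real vision system stacks many more stages on top of this.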
Okay, so I said this was a neural network. And that brings us to question two. What is a neural network? So a neural network is a large, complex mathematical function that maps inputs to outputs. So the input is some collection of numbers. The outputs are also a collection of numbers. And this function is composed of many little simple functions which people call units or sometimes they call them neurons. So here's an example of a neural network. We have a bunch of numbers on the input, we have these processing units here, and we have some numbers on the output. So each unit takes multiple numbers as input and produces one number as output.
Okay, so let's take a look at what a single neuron can do.
This is a one neuron neural network. And it's asking the question, "Does John get to eat dessert tonight?" Actually, it's answering that question. So he gets to eat dessert if it's a healthy dessert like fruit, or if it's not healthy, if it's cake or ice cream or something, he gets dessert if he ate all his vegetables, finished his milk, and cleared the table. So each of these inputs is going to be zero or one for true or false. We assign a weight to the inputs. We multiply each input by the weight and we sum them up. And if that sum is greater than two, then John gets to eat dessert. If it's not greater than two, he does not. And this little truth table here shows some of the cases. Here's the sums and here are the outputs. So one if he gets to eat dessert, zero if he does not.
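The dessert neuron can be written out in a few lines of Python. This is a minimal sketch, not the slide itself. The transcript does not record the actual weights, so the values below (3 for the healthy dessert input, 1 for each of the others) are an assumption chosen to reproduce the rule as described, and the input names are invented. Only the four conditions named in the talk are modeled.

```python
# Assumed weights: the slide's real values are not in the transcript.
WEIGHTS = {
    "healthy_dessert": 3,
    "ate_vegetables": 1,
    "finished_milk": 1,
    "cleared_table": 1,
}
THRESHOLD = 2  # from the talk: dessert only if the weighted sum is greater than two


def gets_dessert(inputs):
    """Multiply each 0/1 input by its weight, sum, and compare against the threshold."""
    total = sum(WEIGHTS[name] * inputs[name] for name in WEIGHTS)
    return 1 if total > THRESHOLD else 0


# Cake, but John did all three chores: 0*3 + 1 + 1 + 1 = 3, which is greater than 2.
cake_all_chores = gets_dessert(
    {"healthy_dessert": 0, "ate_vegetables": 1, "finished_milk": 1, "cleared_table": 1}
)

# Cake, and John skipped the milk: 0*3 + 1 + 0 + 1 = 2, which is not greater than 2.
cake_no_milk = gets_dessert(
    {"healthy_dessert": 0, "ate_vegetables": 1, "finished_milk": 0, "cleared_table": 1}
)

# Fruit, no chores: 1*3 = 3, which is greater than 2.
fruit_no_chores = gets_dessert(
    {"healthy_dessert": 1, "ate_vegetables": 0, "finished_milk": 0, "cleared_table": 0}
)

print(cake_all_chores, cake_no_milk, fruit_no_chores)
```

With these assumed weights, a healthy dessert clears the threshold on its own, while an unhealthy one needs all three chores.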
So this is called a linear threshold unit and it's just one neuron. And you see it's able to do some fairly complex reasoning. So what happens when you put a whole bunch of these things together? Well, they're able to do quite a bit more. So neural networks are organized in layers. We have an input layer, an output layer, and in between, one or more hidden layers. And deep neural networks, they're just neural networks with lots of layers. So something like ChatGPT is a deep neural network.
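To make "put a whole bunch of these things together" concrete, here is a small sketch of a network with one hidden layer, built from the same kind of linear threshold unit. The example task (exclusive-or, which is true when exactly one input is true) is not from the talk. It is used here because a single threshold unit cannot compute it, but two layers can. The weights and thresholds are hand-picked for illustration.

```python
def unit(inputs, weights, threshold):
    """One linear threshold unit: weighted sum of the inputs, then a cutoff."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def xor_network(a, b):
    """Input layer (a, b) -> hidden layer of two units -> one output unit."""
    hidden_or = unit([a, b], [1, 1], 0.5)    # fires if at least one input is 1
    hidden_and = unit([a, b], [1, 1], 1.5)   # fires only if both inputs are 1
    # Output fires when the OR unit is on and the AND unit is off.
    return unit([hidden_or, hidden_and], [1, -1], 0.5)


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

Each hidden unit answers a simple question, and the output unit combines those answers. Deep networks repeat this pattern across many layers.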
The important thing about these networks is that you don't have to set the weights by hand. So in the John gets to eat dessert example, I programmed the weights by hand. But as these networks get larger, it's not possible for human beings to program the weights by hand. So the crucial thing is that there's an algorithm, there's a machine learning algorithm, remember, big idea three, machine learning, there's a machine learning algorithm that will automatically adjust the weights. So what you do is you show it an input and a desired output. So you show for this input, "This is the output you're supposed to produce," and the network will produce some actual output. You compare the actual output against the desired output. If they match, you're good. If they don't match, you generate an error signal, and that error signal is sent backwards through the network, that's why the algorithm's called back propagation learning, and it's used to adjust the weights. And so you make little adjustments to the weights. Do this a few trillion times and you end up with ChatGPT.
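The loop described above (show an input, compare the actual output to the desired output, nudge the weights when they differ) can be sketched for a single unit. This is not backpropagation itself, which sends the error back through many layers. It is the simpler error-correction rule for one threshold unit, used here as a stand-in. The four-input dessert rule, the function names, and the learning rate are assumptions for illustration.

```python
from itertools import product


def target(healthy, vegetables, milk, table):
    """Desired output: the dessert rule as described in the talk."""
    return 1 if healthy or (vegetables and milk and table) else 0


def predict(weights, bias, inputs):
    """Actual output of one threshold unit with the current weights."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(examples, learning_rate=1, max_epochs=1000):
    """Start from zero weights and adjust them a little after every mistake."""
    weights = [0, 0, 0, 0]
    bias = 0
    for _epoch in range(max_epochs):
        mistakes = 0
        for inputs, desired in examples:
            error = desired - predict(weights, bias, inputs)  # the error signal
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # every example now matches its desired output
            break
    return weights, bias


# All 16 combinations of the four yes/no inputs, each paired with its desired output.
examples = [(inputs, target(*inputs)) for inputs in product([0, 1], repeat=4)]
weights, bias = train(examples)
print(weights, bias)
```

Nobody typed in the final weights. They came out of repeated small corrections, which is the point of the passage above, just at a vastly smaller scale.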
Okay, so there's some hype about neural networks I want to warn you about. People will say that neural networks work the way the brain does. Anyone who tells you that, walk away. That's absolutely not true. And I say this as someone who has personally done surgeries on rats, inserted electrodes in their brains, and recorded from brain cells. Neural networks do not work the way brains do. They were inspired by how we think brains might work, but we don't know how brains perceive, reason, remember things. We don't know in enough detail to make use of that knowledge to build things. So we don't build neural networks the way the brain works 'cause we don't know how the brain works. Don't buy the hype. It's okay that we don't know yet. We'll know someday, but we don't know yet. So don't tell your kids that neural nets work the way brains work. They're inspired by brains the same way that a 747 was inspired by a bird. Yeah, they both have wings, right? But there's no feathers on my 747 that I'm flying in.
Okay, so that's the second question. We have three more questions to go, so I'll go a little faster.
What are foundation models? So foundation models are large AI models, typically neural network models, that have been trained on massive amounts of data, terabytes of text or millions of images. And once they've been trained up using these machine learning algorithms, they can act as a foundation on which you can build specialized reasoners to solve different kinds of problems like chatting for example. So there are different kinds of foundation models.
GPT-4 or LaMDA, these are text-to-image models. DALL-E 2 and Stable Diffusion, these are, sorry, GPT-4 and LaMDA are text-to-text models, and DALL-E 2 and Stable Diffusion are text-to-image models. So there's a whole bunch of different kinds of foundation models. LaMDA, by the way, that's the Google large language model that a Google engineer was claiming had become sentient. He tried to hire a lawyer to defend it and got fired for his efforts. So yeah, it's a different world right now.
ChatGPT is an example of what's called a large language model. It's a text-to-text model. These large language models are trained on an unimaginably large dataset, like all of Wikipedia, right? The entirety of English Wikipedia, plus maybe 80,000 or so books, plus a good chunk of Reddit. This is more than any human being could read in a lifetime. All the big AI companies are building these large language models now. It's not just OpenAI; everybody's doing it.
Here's how you train a model like GPT-3. What it's asked to do primarily is just predict the next word in a sentence. So if you give it part of a sentence, like, "Since he was out of milk on the way home from work, John what?" it predicts several possible words that might be the next word: stopped, dropped, bought. Suppose that the correct next word is dropped. So we put that into the input here and now we ask it again. "Since he was out of milk on the way home from work, John dropped, what's the next word?" Again, it makes some set of predictions. At every step, you reward it for good predictions, you make it change the weights for incorrect predictions, and you just keep doing this. Once you've trained this thing up, now you can get it to answer questions.
If you ask it, "Where do eagles live?" because it's been trained on a question-answer format, what it predicts is the first word of the answer. The first word is going to be eagles. So you take that and you stick that back in and you ask it to predict the next word of the answer and so on, and so it ends up outputting. This is actually the first word of the first sentence of the actual answer I got from ChatGPT when I asked it, "Where do eagles live?" So it's just predicting one word at a time.
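The "predict a word, stick it back in, predict the next one" loop can be illustrated with a toy model. To be clear, this is nothing like the neural network inside ChatGPT. It is a simple word-pair counter, and the three-sentence training text and function names are invented for the example. The only thing it shares with a large language model is the outer loop that generates one word at a time.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words followed it in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, num_words):
    """Repeatedly predict the most likely next word and append it to the output."""
    output = list(prompt)
    for _step in range(num_words):
        candidates = counts.get(output[-1])
        if not candidates:  # the model has never seen anything follow this word
            break
        next_word, _count = candidates.most_common(1)[0]
        output.append(next_word)  # the prediction becomes part of the next input
    return output


# A made-up training text, tiny compared to the datasets described in the talk.
training_text = (
    "eagles live in tall trees . "
    "eagles live in tall nests . "
    "eagles hunt fish ."
)

counts = train_bigram(training_text)
sentence = generate(counts, ["eagles"], 3)
print(" ".join(sentence))
```

A model this small only looks at the previous word, so it cannot do anything resembling reasoning. The large models discussed in the talk condition each prediction on the whole preceding text.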
Now, some people have said that because it's just predicting one word at a time, this thing is just a fancy autocomplete and it's not doing anything interesting. That's not true. Early language models, earlier as in last year, could only make sort of crude statistical predictions based on the small amount of data they were trained on. But what happened that nobody expected is when these things got large enough, so the network itself, the neural network itself, got large enough, enough units, enough layers, and the dataset that it was trained on got large enough, there was this kind of phase change that happened that nobody was anticipating. Suddenly, this thing started reasoning. It's not human-like reasoning, but it's definitely reasoning. It's showing levels of understanding and reasoning abilities and problem-solving abilities that nobody anticipated.
That's why this is such a big deal. We've only had these models for a couple of years now, and the public only found out about it in November when OpenAI went public with ChatGPT. They've been around for a couple of years. Google had a model that they said was better, but Google was too afraid of releasing this thing because there are all kinds of bad things that can happen if you let people play with this, right?
And in fact, when Microsoft went public with their version, so it was called Bing connected to GPT, some New York Times reporter got into this dialogue with this thing, and it was telling him to leave his wife, and, you know, it was in love with him, and, you know, all kinds of psychosis. If you're of the opinion that all publicity is good publicity, this is great, right? "Look, our evil computer's trying to get this guy to leave his wife, and by the way, our name is OpenAI," and, you know. But if you're Google, the reputational risk is huge. They were rightly terrified that if they opened this thing up to the public, crazy bad things would happen. So OpenAI just came in and ate their lunch, and they got all the publicity for a technology that Google had actually invented. ChatGPT is based on transformers, which were developed at Google. Sometimes being timid will cost you.
All right, so how does it really work? That's what you are probably wondering, right? How does it really work? We don't know. It's not just I don't know. The people who build these things will tell you that they don't know either. There's a whole new area of research trying to figure out what's going on inside the head of these things. There'll be many, many PhD theses written about that as we try to puzzle out how these things work. It's not like we don't know anything; there are theories about it, but it's very complicated. I'm not going to try and get into it right now, but I'm just going to show you a few pretty pictures.
So this is called a transformer architecture. This was invented by Google. It's behind all the large language models that are popular right now, like ChatGPT, GPT-4, Google's own Bard, and so on. Just think of this as a gigantic neural network, really a really big neural network. It uses something called self-attention. That was sort of the magic trick that makes transformer networks better than just ordinary plain vanilla neural networks. Just to give you a picture of what this looks like, this self-attention here is a pretty interesting computational process if you just do it once. In GPT-3, they do it 96 times. There's layer upon layer of self-attention. You're up to 96 layers of self-attention, and this thing has 175 billion parameters.
Remember our one neuron that was deciding if John got to eat dessert or not? That neuron had six parameters. It had five inputs and a threshold, so it had six parameters. This thing has 175 billion parameters, so it's not surprising that it's able to do a lot of really surprising stuff that we don't know how to program. You can't program ChatGPT. We don't have algorithms to do what this thing does. Machine learning invented an algorithm to make it do this, and we're trying to figure out what that algorithm is.
Okay, oh, and by the way, training these things is mind-numbingly expensive, right? You're talking tens of millions of dollars to train this thing up. Huge data center. Some of these, even the smaller models, are quite demanding of resources. The largest models are so large that people are starting to worry about the ecological effects of building all of these data centers. All of them take power. They take water for cooling. People are starting to worry that maybe we're putting all our resources into these giant data centers.
All right, I just have a couple more questions I want to cover with you quickly. How will AI technologies change robotics education? Well, we're going to have robots that can see much better. We can have robots that understand and use complex language, and we can have robots that are good at planning and complex reasoning. Now, we have robots today that do all this stuff, but they're programmed by PhDs.
And the idea is with these foundation models, anybody will be able to get a robot to do these things, even for new tasks. You come up with some new task or you have some new kind of robot, you'll be able to make the robot adapt to your task very quickly because of these foundation models that are really changing the game.
So I want to show you a little bit about what robot programming can look like when AI is involved. So full disclaimer, Calypso is a commercial product from me. So I made this thing, I sell it. At the moment, you can run it for free on the cloud version, but originally I developed it for a robot called the Cozmo robot. This is a robot intelligence framework. The elevator pitch is PhD level robot programming done by eight-year-olds. We had built-in computer vision, speech recognition, landmark-based navigation, path planning, and object manipulation. It has these little cubes that it can manipulate. And a rule-based language, and it teaches computational thinking. You can try it for free at calypso-robotics.com.
Here's a picture I took from my friend's Tesla. So we're driving down the highway, there's a truck in front of us over here, there's a car in the other lane, there's another car over here, and the Tesla is showing you its world map. So here's us and here's the truck ahead of us. And here are the other cars in the other lane. The Tesla has a world map. Remember, a Tesla is just the robot that you sit in, right? Self-driving cars are robots.
So here is the same thing in Calypso. We've got the little Cozmo robot and it's sitting on a tabletop, and here are a couple of its cubes that it knows how to manipulate. It can pick them up, it can roll them, and so on. Here's a little robot house that's got some markers on it. And here's the robot's world map showing you the cubes, the wall that it's looking at. The blue thing is the doorway. Here's the rule-based language so we can instruct the robot. It's got built-in speech recognition so if you say something, it recognizes that, it can respond to your speech, and it can also speak itself. So this is a real thing. You can run this software today. If you have a Cozmo robot, you can do this today.
But how is this going to change? Well, foundation models are bringing new capabilities that we didn't imagine before. Depth maps. So from a single image, we can now get the depth of that image. You can do that 'cause you're smart, you have a big brain, but robots used to need stereo cameras or structured light like the Microsoft Kinect or LiDAR. You don't need that anymore. You just have this gigantic neural network that was trained to understand scenes, and we're able to map single images to depth. We can have object recognition and scene understanding automatically from these foundation models. We can turn natural language instructions into code. People are already using ChatGPT to generate code. We'll be able to do that for our robots as well. And lots more things to come.
Okay, so last big question, what can I do now to prepare to teach AI? Well, we have a lot of resources at ai4k12.org. We have a curated resource directory there, so we've gone through and found the best stuff that's appropriate for K-12 including books, videos, demonstration software, and curriculum materials that other people have developed. So you can go through this material. You can join our mailing list. There's a link at the bottom of the ai4k12.org website. You can try my cloud Calypso for free. And it doesn't cost much if you want an institutional license. It's really cheap. I'm a much better computer scientist than I am a businessman, so I'm not getting rich off this thing, I'll tell you that.
Keep an eye out for exciting new robots coming soon for K-12 from VEX. VEX people are going to have a really interesting announcement for you soon.
Okay, so here's our website again, ai4k12.org.
We would love to have you join the mailing list. There are lots of K-12 educators on there as well as AI education researchers and AI researchers. So it's a really cool community. We'd love to have you join. And thank you.
(audience applauding)
So we're going to go ahead and pause for questions, but I think if you go to the previous slide, Dave, there were still some folks, yeah, trying to take some pictures. I know there are probably a lot of questions, and Dave is going to be gracious enough with his time to answer some questions. So who wants to be first? Come up to the microphone, please, or tell us to bring you the microphone, even better.
Hello, so I was thinking about something from the past. I tried to be an editor of a topic on Wikipedia that I was an expert on and continue to be, and when Wikipedia first came out, it was looking for editors because you could be curating the information correctly, right? But I wasn't allowed to be an editor. They stopped me and I don't know why. But I wondered about garbage in, garbage out. So if I go to that page today, which I haven't done for a while, and the information that these models have based that topic on is incorrect, then is it self-correcting? If I go and correct it on Wikipedia, does that correction get taken back into the model? I have so many other questions, but that was just one that I thought of, the garbage in, garbage out. How do we contend with that?
Yeah, that's a great question. So there are a lot of issues surrounding the choice of training data for these models. Because they're very expensive to train, they don't get trained from scratch very often. You know, you're talking tens of millions of dollars. And so people are worried that if you just collect everything, so especially things like Reddit, which is full of all kinds of craziness, but even Wikipedia, there's a danger that the model may reflect biases in the data or incorrect information that would be undesirable. And so people are thinking about, you know, what can they do to curate these datasets? But these datasets are so large that it's a hard question, right? You can't really manually review everything, you know, and still have the size of dataset that you need. So I don't have a simple answer for that.
One thing people are doing is hooking these things up to a web search engine. So ChatGPT was just trained on a dataset that was created in 2021. So it doesn't know anything about more recent developments. But there are other large language models that are interfaced now to web search, and so they can go look on the web and get the latest information. So presumably, if Wikipedia changes an article, then eventually these models will pick up on that.
Also, you know, there's been this problem with hallucination. So you can ask ChatGPT about something, and oftentimes it will just make stuff up. So if it knows the answer, it'll probably tell you the answer. But if it doesn't know the answer, it may very well just confabulate and make stuff up. And people are building in defenses against that by having it do a fact check. So I think the Bing version, when you ask it a question and it thinks it knows the answer, it then does a web search to verify that the answer's correct. So I think it'll get better.
But, you know, ultimately, there's this danger in the centralization of knowledge, right? So, you know, Wikipedia, their goal is to reflect all views, right? To have well thought out, balanced articles. Wikipedia also has a lot of political infighting going on behind the scenes. So like you said, some people don't get to be editors who should be editors. And as things get more centralized, that's more of a problem. So I don't know what the right answer is there.
Question over here, Dave, to your right. Yes, hi. Hi.
So of course when ChatGPT came out, all of the kids were very surreptitiously like, "Oh no, I'm not using ChatGPT for my history essay." And all of the English teachers are like, "This is the devil, what are we going to do?"
As the engineering robotics teacher, I was thinking, well, what we need to figure out is how do I make my kids understand how to use this as a tool to get the best possible output? And so I was wondering if you had any thoughts on ways of training the kids or teaching the kids how do we use this as a tool and just see this as a tool, and the human is still the most important part of the equation. We're not replacing the human, it's like automation in our factories. We're not replacing the human, we're giving the human a different role and teaching them how to use a tool. Do you have any thoughts on that?
Yeah, so I think there are lots of wonderful ways that we can use these large language models for education. Everybody can have a personal tutor, for example. So here's something that we stumbled upon. I have another project called AI for Georgia. We're developing a middle school AI elective for Georgia middle school students. One of the things we came up with was this idea of having them design a robot. We're teaching them about computer perception, how self-driving cars work, and how different kinds of sensors work.
In this multi-stage process of designing a robot, one of the things we asked them to do is to figure out, first of all, what's your robot going to do? We gave an example of a robot whose job is to mop up the floor in the cafeteria. Okay, so you have some tasks you want your robot to do. Now, what kinds of sensors will your robot need in order to accomplish this task? Since we've taught them about different kinds of sensors, they were able to come up with some answers. Then, how will your robot interact with people and so on, right? There's this whole series of design choices that we ask them to make as they think through what their robot is going to look like and how it's going to function.
So I got this idea. I said to ChatGPT, "Pretend that you are a robot whose job is to mop up floors and have a discussion with me." Then I asked it, "What sensors do you have?" It came back with this very reasonable discussion about the kinds of sensors that a floor mopping robot would want to have. We're going to have kids do this. As they imagine their robot, we're going to have them make ChatGPT play the role of the robot that they're designing and let it give them more insights, or let them test their ideas, right? They could say to ChatGPT, like, "Do you have a radar?" And it might say, "Well, you know, radar doesn't really do much indoors. I think camera input is going to be more useful for my task," for example. It's actually quite sophisticated in discussing its life as a hypothetical floor mopping robot.
I think you're going to see things like this all over the place, right? In history class, suppose you'd like to have a conversation with George Washington, right? ChatGPT can roleplay that. The ways that we're going to put this to use, we're still figuring that out, but they're so positive, they're so strong, that worrying about some kid getting ChatGPT to write their essay for them, that's not going to be a significant worry. We'll find ways to deal with that. But the positive payback is so huge, it's really exciting.
Any other questions? Any other questions? No? Okay. Round of applause.
(audience applauding)
(warm music)
And I want to address five questions. What is artificial intelligence? What are deep neural networks? What are foundation models? How will robotics education be changed by these technologies? And what can I do now to prepare? So let's just jump into it here. Your students may ask you, "What is artificial intelligence?" And here's an answer that you can give them. Artificial intelligence is, first of all, it's a branch of computer science. It is concerned with techniques that allow computers to do things, okay? So psychology is concerned with how people think. Artificial intelligence is concerned with how to get computers to do things, specifically things that when people do them are considered evidence of intelligence. So that's the definition of AI that we're going to work with.
We're already surrounded by AI technologies. You use AI every day, speech recognition when you talk to Alexa, automated subtitles on videos, that's AI doing speech recognition behind the scenes. Computer vision. You may not be brave enough to sit in the back of your Tesla while it's driving, but many of your cars have automated lane departure warnings, for example. That's computer vision. You unlock your phone with your face. That's computer vision. Language understanding, if you ask Google, "What's the second largest city in Honduras?" and it gives you the answer, there's artificial intelligence behind that. Google Translate, you can translate between any of 112 languages. My favorite Chinese restaurant had an insert in my takeout that was written in Chinese. And I can't read Chinese, so I just pointed my phone at it and it translated it to English for me. That's AI. All these recommender systems when Amazon's trying to sell you more stuff, Netflix is trying to come up with movies for you to watch, all that's based on their model of what they think you're interested in. That's AI.
And of course, in robots, many people have Roombas cleaning their houses. Amazon's got these Kiva robots in their warehouses. All of these things are powered by AI. So we're already living in an AI world, but it's about to get a lot more wild.
AI does pose challenges for society. Many types of work are going to become automated. Some types of jobs will be eliminated. More common will be that just the way people do those jobs will change. A lot of people will be able to be more effective, more productive because AI will be assisting them.
Other places where things can go wrong include people using AI systems to make decisions about people, such as whose resume gets reviewed, who gets a mortgage, and who gets admitted to prestigious schools. If AI systems are making these decisions that affect people's lives, there are questions about whether they are biased unfairly, if they are transparent, and if somebody is taking accountability for what these systems do. These are ethical questions that have to be addressed by our society.
AI is powering the surveillance state. China's a leader in this area, with face recognition cameras on all the street corners. But the US is also in danger of this kind of surveillance state mentality, and we have to decide if we're going to live with that or not. Then there are things like deep fakes that are undermining our confidence in audio and video recordings as being truthful. So it's not all good; AI is both good and bad, but these are issues that are here today, right now.
I'm the founder and chair of ai4k12.org. We were funded by NSF, and our mission was to establish national guidelines for teaching AI in K-12. So how do you teach AI in K-12? There are computer science standards that CSTA, the Computer Science Teachers Association, published. The most recent version was in 2017, and the entire document contained just two sentences about AI. They just didn't address it. So we came along and said, "Let's remedy this."
Our work was co-sponsored by CSTA and by AAAI, the Association for the Advancement of Artificial Intelligence. We set to work trying to figure out what K-12 students should know about AI and what they should be able to do with it. One of the first things we did was release a list of five big ideas in AI, as you can see on the slide here, and a little infographic. This is all now available on a poster, which has been translated into 16 languages because there's so much interest in teaching AI at the K-12 level.
I want to take you through the five big ideas very quickly.
Big idea one: perception. Computers perceive the world using sensors. We make a distinction between sensation and perception. Sensation is the raw signal, while perception is extracting meaning from the sensory signal. Here you see how a self-driving car sees the world. The camera does the sensing, recording the image, but the perception—seeing pedestrians, vehicles, traffic lights, and street signs—is where the AI comes in. That's the perception part. Is it possible to have sensing without perception? Absolutely. That's why you get deer in the supermarket, because you've got this automated door with a sensor, but there's no perception. If there was perception, you wouldn't let the deer in. But without perception, the deer can come in.
Big idea two is representation and reasoning. Agents maintain representations of the world and use them for reasoning. If you teach computing, this is analogous to having data structures and algorithms. The representations are the data structures, and reasoning is the algorithm. The illustration here is from the famous Go match between Lee Sedol, the Go grandmaster, and AlphaGo, the program built by Google's DeepMind. AlphaGo beat the best Go player on the planet a few years ago. Obviously, AlphaGo had some representation of the board, and the reasoning algorithm was how you choose the next move.
Big idea three is learning: computers can learn from data. This is what's really powered the AI revolution that's taken place over the last 10 to 15 years. Machine learning suddenly became very powerful, and when applied to other AI problems, it made a lot of other parts of AI work much better than before. For example, speech recognition has been a focus since the 1960s, but it didn't work very well until machine learning techniques became powerful enough to make it effective. The same is true for computer vision, language understanding, and many other areas. Some people conflate machine learning with AI, but they're not the same thing. AI started in the 1950s, while machine learning really began in the 1980s as a sub-discipline of AI. Then, in the 2010s, deep neural networks emerged as a particular approach to machine learning, making things like speech recognition work. The fact that you can talk to Siri now is because of this progress in the field powered by machine learning.
Big idea four is natural interaction. Intelligent agents require many kinds of knowledge to interact naturally with humans. I'll be honest, this is kind of the big carpet under which we swept a lot of stuff because we wanted to have five big ideas. This includes natural language understanding, common sense reasoning, affective computing (reasoning about people's emotional states), and consciousness and issues of philosophy of mind. Can a computer be conscious? How would you know? All of these we've grouped together under natural interaction.
Big idea five is societal impact, the idea that AI can impact society in both positive and negative ways. We're already seeing that today. One aspect of that is the ethics of making decisions about people. You can make an AI system and trivially train it to make decisions about people, but will they be good decisions? Will they be fair? Will there be unintended consequences? That's a much harder thing to get right. It's not even easy to know if you've gotten it right because there are multiple definitions of fairness. Even if you want to be fair, it may not be clear how you define fair. This is one aspect of making decisions about people.
Another aspect is the economic impacts of AI. Increased productivity is a good thing, and new types of services become possible, but there is also a reduction in some types of jobs, hopefully balanced by new career opportunities. One of my students told me yesterday that people who can claim they can do prompt engineering have salaries in the $300,000 range right now. So there are new opportunities coming. If you're training your students to think about careers, they should be thinking about, "How am I going to use AI in my career?"
The third aspect of societal impact is how AI is affecting our culture. Consider things like deep fakes and Snapchat filters. I love this picture here: a self-driving car with a girl on her phone, a toddler in the back staring at the ceiling, and the only one watching the road is the dog, which can't drive. Would you put your child in a self-driving car? At what age? At what age do you think it's okay to put your kid in a self-driving car? Obviously, when they're 18, you can't control it anymore, right? But is 12 old enough? What about 10, six? In my mind, I'm thinking, "What would I want to put in a self-driving car that would make parents comfortable to put their kids in there?" You could imagine monitoring systems, safety systems. Maybe the car won't unlock unless there's an adult at the other end who the car recognizes, and it'll release the child to that person.
There are new kinds of services you can imagine because of these AI technologies, and figuring out what our culture's going to look like, that's an interesting thing that we all get to participate in. Right now, we all get to watch YouTube videos of people sleeping in their Teslas at 70 miles an hour. And, you know, that's also kind of an interesting part of our culture.
Okay, so we want to teach students about AI and help them understand these big ideas. How are we going to do that? We published a set of grade band progression charts. There are five of these progression charts, one for each of the five big ideas. And we cover four grade bands, just like the CSTA computing standards. So the grade bands are K-2, 3-5, 6-8, and 9-12. These are published on the ai4k12.org website.
So this little inset here is one page of the big idea one grade band progression chart. I'm not going to go through all the detail, I just want to give you a little taste of what this is like. So we have the big idea, its name is perception. We have the one sentence statement of the big idea, "Computers perceive the world using sensors." And then we have these little crucial insights. These are the orange boxes. So the first one you've already seen, "Perception is the extraction of meaning from sensory information using knowledge." It's sort of this crucial understanding about perception. And then the second one here may be less familiar to you. "The transformation from signal to meaning takes place in stages, with increasingly abstract features and higher level knowledge applied at each stage."
Now, how do you explain that to somebody in K-12? Well, every column of the progression chart is one of these grade bands, and every row is a major concept or sub-concept. So if we just zoom in, we're going to look at just one row. So this is row 1-B-iv, the abstraction pipeline for vision. And so in grades 3-5, we have something just talking about occlusion of objects. So if you look at these shapes here, if I ask you, "What is this red thing?" right, you might tell me that it's a circle, but it's not, right? It's not a perfect circle. There's a chunk missing because it's being occluded by the green triangle. So we're getting kids to think about the perception of objects and what occlusion means, 'cause computers have to deal with occlusion when they're looking at real world scenes.
So that's grades 3-5. In grades 6-8, we can talk to them about edge detectors and the progression from image to meaning taking place in stages. When they get up to 9-12, we can do more. This is an actual neural network for face detection. So you have a color image over here, you run this through the neural network, and out pops a little box saying, "I see a face." And here's where the face is in the image. And these things over here, these are actually edge detectors and blob detectors. And this is an online demo. You can run this and you can actually play with the neural network. You can look at pieces of the neural network. We have experiments that you can do. And so this is how we teach computer vision at the 9-12 level, by having people do hands-on experimentation.
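For readers who want a feel for what an edge detector does, here is a minimal sketch in Python. It is not the code from the online demo. It assumes a single row of pixel brightness values (0 to 255) and flags the places where brightness jumps sharply between neighbors. The function name `detect_edges` and the threshold of 50 are made up for illustration; real vision networks learn two-dimensional filters rather than using a hand-picked rule like this.

```python
def detect_edges(row, threshold=50):
    """Return positions where brightness changes by more than `threshold`
    between a pixel and its right-hand neighbor (hypothetical helper)."""
    edges = []
    for i in range(len(row) - 1):
        if abs(row[i + 1] - row[i]) > threshold:
            edges.append(i)
    return edges


# One row of an imagined image: dark, then a bright object, then dark again.
row = [10, 12, 11, 200, 205, 198, 15, 14]
print(detect_edges(row))  # [2, 5]
```

The two positions reported are the left and right borders of the bright object, which is the kind of low-level feature that later stages can build on.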
Okay, so I said this was a neural network. And that brings us to question two. What is a neural network? So a neural network is a large, complex mathematical function that maps inputs to outputs. So the input is some collection of numbers. The outputs are also a collection of numbers. And this function is composed of many little simple functions which people call units or sometimes they call them neurons. So here's an example of a neural network. We have a bunch of numbers on the input, we have these processing units here, and we have some numbers on the output. So each unit takes multiple numbers as input and produces one number as output.
Okay, so let's take a look at what a single neuron can do.
This is a one neuron neural network. And it's asking the question, "Does John get to eat dessert tonight?" Actually, it's answering that question. So he gets to eat dessert if it's a healthy dessert like fruit, or if it's not healthy, if it's cake or ice cream or something, he gets dessert if he ate all his vegetables, finished his milk, and cleared the table. So each of these inputs is going to be zero or one for true or false. We assign a weight to the inputs. We multiply each input by the weight and we sum them up. And if that sum is greater than two, then John gets to eat dessert. If it's not greater than two, he does not. And this little truth table here shows some of the cases. Here are the sums and here are the outputs. So one if he gets to eat dessert, zero if he does not.
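The dessert neuron can be written out in a few lines of Python. This is a sketch under assumptions: the transcript does not record the weights or the full input list shown on the slide, so the four input names and the weights below (3 for a healthy dessert, 1 for each chore) are one choice that reproduces the rule as described, with the "greater than two" threshold taken from the talk.

```python
# Assumed weights: not taken from the slide, chosen to match the spoken rule.
WEIGHTS = {
    "healthy_dessert": 3,
    "ate_vegetables": 1,
    "finished_milk": 1,
    "cleared_table": 1,
}
THRESHOLD = 2  # "if that sum is greater than two"


def gets_dessert(inputs):
    """Linear threshold unit: weighted sum of 0/1 inputs compared to a threshold."""
    total = sum(WEIGHTS[name] * inputs[name] for name in WEIGHTS)
    return 1 if total > THRESHOLD else 0


# Cake, but only two of the three chores done: sum is 2, so no dessert.
print(gets_dessert({"healthy_dessert": 0, "ate_vegetables": 1,
                    "finished_milk": 1, "cleared_table": 0}))  # 0

# Fruit, no chores done: sum is 3, so dessert.
print(gets_dessert({"healthy_dessert": 1, "ate_vegetables": 0,
                    "finished_milk": 0, "cleared_table": 0}))  # 1
```

Students can change the weights or the threshold and see which rows of the truth table flip.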
So this is called a linear threshold unit and it's just one neuron. And you see it's able to do some fairly complex reasoning. So what happens when you put a whole bunch of these things together? Well, they're able to do quite a bit more. So neural networks are organized in layers. We have an input layer, an output layer, and in between, one or more hidden layers. And deep neural networks, they're just neural networks with lots of layers. So something like ChatGPT is a deep neural network.
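As a small illustration of what layers add, here is a sketch of a network with two inputs, one hidden layer of two threshold units, and one output unit. The example task (output 1 when exactly one input is 1, often called exclusive-or) is not from the talk; it is a standard classroom case that a single threshold unit cannot compute but a two-layer network can. All weights and thresholds are set by hand for illustration.

```python
def unit(inputs, weights, threshold):
    """One linear threshold unit: fires (1) if the weighted sum exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def xor_network(x1, x2):
    # Hidden layer: one unit detects "at least one input on",
    # the other detects "both inputs on".
    h_or = unit([x1, x2], [1, 1], 0.5)
    h_and = unit([x1, x2], [1, 1], 1.5)
    # Output layer: fires when "at least one" is true but "both" is not.
    return unit([h_or, h_and], [1, -1], 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

The printed table shows 0, 1, 1, 0 in the last column: the hidden units each answer a simpler question, and the output unit combines their answers.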
The important thing about these networks is that you don't have to set the weights by hand. So in the John gets to eat dessert example, I programmed the weights by hand. But as these networks get larger, it's not possible for human beings to program the weights by hand. So the crucial thing is that there's an algorithm, there's a machine learning algorithm, remember, big idea three, machine learning, there's a machine learning algorithm that will automatically adjust the weights. So what you do is you show it an input and a desired output. So you show for this input, "This is the output you're supposed to produce," and the network will produce some actual output. You compare the actual output against the desired output. If they match, you're good. If they don't match, you generate an error signal, and that error signal is sent backwards through the network, that's why the algorithm's called back propagation learning, and it's used to adjust the weights. And so you make little adjustments to the weights. Do this a few trillion times and you end up with ChatGPT.
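To make the show-compare-adjust loop concrete, here is a minimal sketch that learns the dessert rule instead of having the weights set by hand. It uses the classic perceptron learning rule on a single threshold unit, which is a simpler relative of backpropagation: there are no hidden layers, so the error does not need to be sent backwards through anything. The input ordering, learning rate, and function names are assumptions made for this example.

```python
from itertools import product


def desired_output(healthy, vegetables, milk, table):
    """The dessert rule from the talk, used only to label the training examples."""
    return 1 if healthy or (vegetables and milk and table) else 0


def predict(weights, bias, x):
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train(examples, rate=1.0, max_epochs=1000):
    """Perceptron rule: nudge the weights whenever actual and desired outputs differ."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, desired in examples:
            error = desired - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                bias += rate * error
        if mistakes == 0:  # every example now matches, so stop
            break
    return weights, bias


# All 16 combinations of the four yes/no inputs, each paired with its desired output.
examples = [(x, desired_output(*x)) for x in product([0, 1], repeat=4)]
weights, bias = train(examples)
print(weights, bias)
print(all(predict(weights, bias, x) == d for x, d in examples))  # True
```

The learned weights will not necessarily match hand-picked ones, but they produce the same answers on every case, which is the point: the algorithm finds a working set of weights from examples alone.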
Okay, so there's some hype about neural networks I want to warn you about. People will say that neural networks work the way the brain does. Anyone who tells you that, walk away. That's absolutely not true. And I say this as someone who has personally done surgeries on rats, inserted electrodes in their brains, and recorded from brain cells. Neural networks do not work the way brains do. They were inspired by how we think brains might work, but we don't know how brains perceive, reason, remember things. We don't know in enough detail to make use of that knowledge to build things. So we don't build neural networks the way the brain works 'cause we don't know how the brain works. Don't buy the hype. It's okay that we don't know yet. We'll know someday, but we don't know yet. So don't tell your kids that neural nets work the way the brains work. They're inspired by brains the same way that a 747 was inspired by a bird. Yeah, they both have wings, right? But there's no feathers on my 747 that I'm flying in.
Okay, so that's the second question. We have three more questions to go, so I'll go a little faster.
What are foundation models? So foundation models are large AI models, typically neural network models, that have been trained on massive amounts of data, terabytes of text or millions of images. And once they've been trained up using these machine learning algorithms, they can act as a foundation on which you can build specialized reasoners to solve different kinds of problems like chatting for example. So there are different kinds of foundation models.
GPT-4 and LaMDA are text-to-text models, and DALL-E 2 and Stable Diffusion are text-to-image models. So there's a whole bunch of different kinds of foundation models. LaMDA, by the way, that's the Google large language model that a Google engineer was claiming had become sentient. He tried to hire a lawyer to defend it and got fired for his efforts. So yeah, it's a different world right now.
ChatGPT is an example of what's called a large language model. It's a text-to-text model. These large language models are trained on an unimaginably large dataset, like all of Wikipedia, right? The entirety of English Wikipedia, plus maybe 80,000 or so books, plus a good chunk of Reddit. This is more than any human being could read in a lifetime. All the big AI companies are building these large language models now. It's not just OpenAI; everybody's doing it.
Here's how you train a model like GPT-3. What it's asked to do primarily is just predict the next word in a sentence. So if you give it part of a sentence, like, "Since he was out of milk on the way home from work, John what?" it predicts several possible words that might be the next word: stopped, dropped, bought. Suppose that the correct next word is dropped. So we put that into the input here and now we ask it again. "Since he was out of milk on the way home from work, John dropped, what's the next word?" Again, it makes some set of predictions. At every step, you reward it for good predictions, you make it change the weights for incorrect predictions, and you just keep doing this. Once you've trained this thing up, now you can get it to answer questions.
If you ask it, "Where do eagles live?" because it's been trained on a question-answer format, what it predicts is the first word of the answer. The first word is going to be eagles. So you take that and you stick that back in and you ask it to predict the next word of the answer and so on, and so it ends up outputting the whole answer. This is actually the first sentence of the actual answer I got from ChatGPT when I asked it, "Where do eagles live?" So it's just predicting one word at a time.
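The predict-a-word, feed-it-back-in loop can be sketched with a toy model. This is not how GPT-3 makes its predictions: a large language model uses a huge neural network, while the stand-in below just counts which word most often follows each word in a tiny made-up corpus. What the sketch does share with the real thing is the outer loop, where each predicted word is appended to the input before the next prediction. The corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, first_word, max_words=10):
    """Repeatedly predict the most likely next word and feed it back in."""
    output = [first_word]
    while len(output) < max_words and output[-1] != ".":
        candidates = counts[output[-1]]
        if not candidates:
            break
        next_word = candidates.most_common(1)[0][0]
        output.append(next_word)
    return " ".join(output)


corpus = ("eagles live near lakes . eagles live near rivers . "
          "eagles live near lakes . eagles hunt fish .")
model = train_bigrams(corpus)
print(generate(model, "eagles"))  # eagles live near lakes .
```

A counting model like this only ever looks one word back, which is why its output is so limited; the talk's point is that very large networks trained the same one-word-at-a-time way end up doing far more.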
Now, some people have said that because it's just predicting one word at a time, this thing is just a fancy autocomplete and it's not doing anything interesting. That's not true. Early language models, earlier as in last year, could only make sort of crude statistical predictions based on the small amount of data they were trained on. But what happened that nobody expected is when these things got large enough, so the network itself, the neural network itself, got large enough, enough units, enough layers, and the dataset that it was trained on got large enough, there was this kind of phase change that happened that nobody was anticipating. Suddenly, this thing started reasoning. It's not human-like reasoning, but it's definitely reasoning. It's showing levels of understanding and reasoning abilities and problem-solving abilities that nobody anticipated.
That's why this is such a big deal. We've only had these models for a couple of years now, and the public only found out about it in November when OpenAI went public with ChatGPT. They've been around for a couple of years. Google had a model that they said was better, but Google was too afraid of releasing this thing because there are all kinds of bad things that can happen if you let people play with this, right?
And in fact, when Microsoft went public with their version, so it was called Bing connected to GPT, some New York Times reporter got into this dialogue with this thing, and it was telling him to leave his wife, and, you know, it was in love with him, and, you know, all kinds of psychosis. If you're of the opinion that all publicity is good publicity, this is great, right? "Look, our evil computer's trying to get this guy to leave his wife, and by the way, our name is OpenAI," and, you know. But if you're Google, the reputational risk is huge. They were rightly terrified that if they opened this thing up to the public, crazy bad things would happen. So OpenAI just came in and ate their lunch, and they got all the publicity for a technology that Google had actually invented. ChatGPT is based on transformers, which were developed at Google. Sometimes being timid will cost you.
All right, so how does it really work? That's what you are probably wondering, right? How does it really work? We don't know. It's not just I don't know. The people who build these things will tell you that they don't know either. There's a whole new area of research trying to figure out what's going on inside the head of these things. There'll be many, many PhD theses written about that as we try to puzzle out how these things work. It's not like we don't know anything; there are theories about it, but it's very complicated. I'm not going to try and get into it right now, but I'm just going to show you a few pretty pictures.
So this is called a transformer architecture. This was invented by Google. It's behind all the large language models that are popular right now, like ChatGPT, GPT-4, Google's own Bard, and so on. Just think of this as a gigantic neural network, a really, really big neural network. It uses something called self-attention. That was sort of the magic trick that makes transformer networks better than just ordinary plain vanilla neural networks. Just to give you a picture of what this looks like, this self-attention here is a pretty interesting computational process if you just do it once. In GPT-3, they do it 96 times. There's layer upon layer of self-attention. You're up to 96 layers of self-attention, and this thing has 175 billion parameters.
Remember our one neuron that was deciding if John got to eat dessert or not? That neuron had six parameters. It had five inputs and a threshold, so it had six parameters. This thing has 175 billion parameters, so it's not surprising that it's able to do a lot of really surprising stuff that we don't know how to program. You can't program ChatGPT. We don't have algorithms to do what this thing does. Machine learning invented an algorithm to make it do this, and we're trying to figure out what that algorithm is.
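To make the parameter count concrete, here is a small sketch of a single threshold neuron with five inputs. The weights, the threshold value, and the "fire when the weighted sum reaches the threshold" rule are assumptions chosen only for illustration. They are not the values from the dessert example in the talk.

```python
def neuron(inputs, weights, threshold):
    """Fire (return True) when the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return total >= threshold

# Hypothetical parameters, chosen only for illustration.
weights = [2.0, 1.0, 1.0, -1.0, -3.0]  # one weight per input: 5 parameters
threshold = 2.5                         # plus one threshold: 1 parameter

n_params = len(weights) + 1             # 5 weights + 1 threshold = 6
gpt3_params = 175_000_000_000           # figure quoted in the talk
ratio = gpt3_params / n_params          # how many times more parameters

decision = neuron([1, 1, 0, 0, 0], weights, threshold)  # weighted sum is 3.0
print(n_params, decision, f"{ratio:.2e}")
```

With six numbers you can read off by hand why the neuron fires. With 175 billion, that kind of inspection is no longer practical, which is part of why nobody can yet say exactly how the model works.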
Okay, oh, and by the way, training these things is mind-numbingly expensive, right? You're talking tens of millions of dollars to train this thing up. Huge data center. Some of these, even the smaller models, are quite demanding of resources. The largest models are so large that people are starting to worry about the ecological effects of building all of these data centers. All of them take power. They take water for cooling. People are starting to worry that maybe we're putting all our resources into these giant data centers.
All right, I just have a couple more questions I want to cover with you quickly. How will AI technologies change robotics education? Well, we're going to have robots that can see much better. We can have robots that understand and use complex language, and we can have robots that are good at planning and complex reasoning. Now, we have robots today that do all this stuff, but they're programmed by PhDs.
And the idea is with these foundation models, anybody will be able to get a robot to do these things, even for new tasks. You come up with some new task or you have some new kind of robot, you'll be able to make the robot adapt to your task very quickly because of these foundation models that are really changing the game.
So I want to show you a little bit about what robot programming can look like when AI is involved. So full disclaimer, Calypso is a commercial product from me. So I made this thing, I sell it. At the moment, you can run it for free on the cloud version, but originally I developed it for a robot called the Cozmo robot. This is a robot intelligence framework. The elevator pitch is PhD-level robot programming done by eight-year-olds. We have built-in computer vision, speech recognition, landmark-based navigation, path planning, and object manipulation. It has these little cubes that it can manipulate. And a rule-based language, and it teaches computational thinking. You can try it for free at calypso-robotics.com.
Here's a picture I took from my friend's Tesla. So we're driving down the highway, there's a truck in front of us over here, there's a car in the other lane, there's another car over here, and the Tesla is showing you its world map. So here's us and here's the truck ahead of us. And here are the other cars in the other lane. The Tesla has a world map. Remember, a Tesla is just the robot that you sit in, right? Self-driving cars are robots.
So here is the same thing in Calypso. We've got the little Cozmo robot and it's sitting on a tabletop, and here are a couple of its cubes that it knows how to manipulate. It can pick them up, it can roll them, and so on. Here's a little robot house that's got some markers on it. And here's the robot's world map showing you the cubes, the wall that it's looking at. The blue thing is the doorway. Here's the rule-based language so we can instruct the robot. It's got built-in speech recognition so if you say something, it recognizes that, it can respond to your speech, and it can also speak itself. So this is a real thing. You can run this software today. If you have a Cozmo robot, you can do this today.
But how is this going to change? Well, foundation models are bringing new capabilities that we didn't imagine before. Depth maps. So from a single image, we can now get the depth of that image. You can do that 'cause you're smart, you have a big brain, but robots used to need stereo cameras or structured light like the Microsoft Kinect or LiDAR. You don't need that anymore. You just have this gigantic neural network that was trained to understand scenes, and we're able to map single images to depth. We can have object recognition and scene understanding automatically from these foundation models. We can turn natural language instructions into code. People are already using ChatGPT to generate code. We'll be able to do that for our robots as well. And lots more things to come.
Okay, so last big question, what can I do now to prepare to teach AI? Well, we have a lot of resources at ai4k12.org. We have a curated resource directory there, so we've gone through and found the best stuff that's appropriate for K-12 including books, videos, demonstration software, and curriculum materials that other people have developed. So you can go through this material. You can join our mailing list. There's a link at the bottom of the ai4k12.org website. You can try my cloud Calypso for free. And it doesn't cost much if you want an institutional license. It's really cheap. I'm a much better computer scientist than I am a businessman, so I'm not getting rich off this thing, I'll tell you that.
Keep an eye out for exciting new robots coming soon for K-12 from VEX. VEX people are going to have a really interesting announcement for you soon.
Okay, so here's our website again, ai4k12.org.
We would love to have you join the mailing list. There are lots of K-12 educators on there as well as AI education researchers and AI researchers. So it's a really cool community. We'd love to have you join. And thank you.
(audience applauding)
So we're going to go ahead and pause for questions, but I think if you go to the previous slide, Dave, there was still some folks, yeah, trying to take some pictures. I know there are probably a lot of questions, and Dave is going to be gracious enough with his time to answer some questions. So who wants to be first? Come up to the microphone, please, or tell us to bring you the microphone, even better.
Hello, so I was thinking about something from the past. I tried to be an editor of a topic on Wikipedia that I was an expert on and continue to be, and when Wikipedia first came out, it was looking for editors because you could be curating the information correctly, right? But I wasn't allowed to be an editor. They stopped me and I don't know why. But I wondered about garbage in, garbage out. So if I go to that page today, I haven't done so for a while, but if there's information that these models have based that topic on, and it's incorrect, then is it self-correcting? If I go and correct it on Wikipedia, does it go back and take that information back into the model? I have so many other questions, but that was just one that I thought of, the garbage in, garbage out. How do we contend with that?
Yeah, that's a great question. So there's a lot of issues surrounding the choice of training data for these models. Because they're very expensive to train, they don't train from scratch very often. You know, you're talking tens of millions of dollars. And so people are worried that if you just collect everything, so especially things like Reddit, which is full of all kinds of craziness, but even Wikipedia, there's a danger that the model may reflect biases in the data or incorrect information that would be undesirable. And so people are thinking about, you know, what can they do to curate these datasets? But these datasets are so large that it's a hard question, right? You can't really manually review everything, you know, and still have the size of dataset that you need. So I don't have a simple answer for that.
One thing people are doing is hooking these things up to a web search engine. So ChatGPT was just trained on a dataset that was created in 2021. So it doesn't know anything about more recent developments. But there are other large language models that are interfaced now to web search, and so they can go look on the web and get the latest information. So presumably, if Wikipedia changes an article, then eventually these models will pick up on that.
Also, you know, there's been this problem with hallucination. So you can ask ChatGPT about something, and oftentimes it will just make stuff up. So if it knows the answer, it'll probably tell you the answer. But if it doesn't know the answer, it may very well just confabulate and make stuff up. And people are building in defenses against that by having it do a fact check. So I think the Bing version, when you ask it a question and it thinks it knows the answer, it then does a web search to verify that the answer's correct. So I think it'll get better.
But, you know, ultimately, there's this danger in the centralization of knowledge, right? So, you know, Wikipedia, their goal is to reflect all views, right? To have well thought out, balanced articles. Wikipedia also has a lot of political infighting going on behind the scenes. So like you said, some people don't get to be editors who should be editors. And as things get more centralized, that's more of a problem. So I don't know what the right answer is there.
Question over here, Dave, to your right. Yes, hi. Hi.
So of course when ChatGPT came out, all of the kids were very surreptitiously like, "Oh no, I'm not using ChatGPT for my history essay." And all of the English teachers are like, "This is the devil, what are we going to do?"
As the engineering robotics teacher, I was thinking, well, what we need to figure out is how do I make my kids understand how to use this as a tool to get the best possible output? And so I was wondering if you had any thoughts on ways of training the kids or teaching the kids how do we use this as a tool and just see this as a tool, and the human is still the most important part of the equation. We're not replacing the human, it's like automation in our factories. We're not replacing the human, we're giving the human a different role and teaching them how to use a tool. Do you have any thoughts on that?
Yeah, so I think there are lots of wonderful ways that we can use these large language models for education. Everybody can have a personal tutor, for example. So here's something that we stumbled upon. I have another project called AI for Georgia. We're developing a middle school AI elective for Georgia middle school students. One of the things we came up with was this idea of having them design a robot. We're teaching them about computer perception, how self-driving cars work, and how different kinds of sensors work.
In this multi-stage process of designing a robot, one of the things we asked them to do is to figure out, first of all, what's your robot going to do? We gave an example of a robot whose job is to mop up the floor in the cafeteria. Okay, so you have some tasks you want your robot to do. Now, what kinds of sensors will your robot need in order to accomplish this task? Since we've taught them about different kinds of sensors, they were able to come up with some answers. Then, how will your robot interact with people and so on, right? There's this whole series of design choices that we ask them to make as they think through what their robot is going to look like and how it's going to function.
So I got this idea. I said to ChatGPT, "Pretend that you are a robot whose job is to mop up floors and have a discussion with me." Then I asked it, "What sensors do you have?" It came back with this very reasonable discussion about the kinds of sensors that a floor mopping robot would want to have. We're going to have kids do this. As they imagine their robot, we're going to have them make ChatGPT play the role of the robot that they're designing and let it give them more insights, or let them test their ideas, right? They could say to ChatGPT, like, "Do you have a radar?" And it might say, "Well, you know, radar doesn't really do much indoors. I think camera input is going to be more useful for my task," for example. It's actually quite sophisticated in discussing its life as a hypothetical floor mopping robot.
I think you're going to see things like this all over the place, right? In history class, suppose you'd like to have a conversation with George Washington, right? ChatGPT can roleplay that. The ways that we're going to put this to use, we're still figuring that out, but they're so positive, they're so strong, that worrying about some kid getting ChatGPT to write their essay for them, that's not going to be a significant worry. We'll find ways to deal with that. But the positive payback is so huge, it's really exciting.
Any other questions? Any other questions? No? Okay. Round of applause.
(audience applauding)
(warm music)