ChatGPT is a type of generative artificial intelligence (AI) technology. It is built on a neural network-based model that has been trained on a large body of text, drawn from datasets such as Common Crawl and WebText, from Wikipedia, and from a large collection of books, to perform natural language processing tasks such as text generation and text classification. It can produce human-like text and respond to prompts in a relevant, coherent way, making it useful for applications like chatbots and virtual assistants.
The math behind ChatGPT is what allows it to pick up on the complex patterns of human language. For simplicity’s sake, I will not dive too deep into the math and will try to give a high-level explanation of what makes it so powerful. “GPT” stands for Generative Pre-trained Transformer, a specific type of neural network that takes input information and passes it through a complex network of interconnected processing units to generate text responses. While it does not save humanity from Decepticons, it might very well save us from extra grunt work.
A Transformer, unlike a traditional recurrent neural network, processes the words of its input in parallel rather than one at a time. It first converts human words into numerical representations and then relates them to one another using a method called ‘multi-headed self-attention’. This method lets the model take in a whole sentence or prompt at once while still keeping track of word order, rather than reading it strictly word by word as we do. As it does so, it draws on what it learned during pre-training from all the data it took from the internet, including the fact that some words are more closely related than others. For instance, the model will notice that ‘Michael Bay’ has a much stronger relation to ‘director’ than it does to the word ‘from’. This way, whenever the model comes across the name ‘Michael Bay’, it already knows it is closely related to the word ‘director’ and denotes the strength of this relation with a number called a weight (this is the self-attention part). The multi-headed component repeats this process several times in parallel, with each ‘head’ free to pick up on a different kind of relationship between the words in the prompt. The results are then combined, so that the model’s understanding of each word reflects all the other words around it.
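To make the idea of attention weights a little more concrete, here is a minimal sketch of self-attention in Python with NumPy. It is only an illustration, not ChatGPT’s actual implementation: the word vectors and projection matrices are random placeholders rather than trained values, the sizes (`d_model`, `d_head`, `n_heads`) are arbitrary assumptions, and details that real GPT models rely on, such as masking and the final output projection, are left out.

```python
import numpy as np

def softmax(scores):
    # Subtract the row max before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Project every word vector into query, key, and value vectors.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Compare every word with every other word in a single matrix product.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Each row becomes a set of weights that sums to 1.
    weights = softmax(scores)
    # Each word's new representation is a weighted mix of all value vectors.
    return weights @ v, weights

def multi_head_attention(x, heads):
    # Run each head independently, then join the results side by side.
    outputs = [self_attention(x, *head)[0] for head in heads]
    return np.concatenate(outputs, axis=-1)

# Placeholder inputs: random vectors stand in for learned word representations.
rng = np.random.default_rng(0)
tokens = ["Michael", "Bay", "is", "a", "director"]
d_model, d_head, n_heads = 8, 4, 2
x = rng.normal(size=(len(tokens), d_model))
heads = [
    tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
    for _ in range(n_heads)
]

_, weights = self_attention(x, *heads[0])
combined = multi_head_attention(x, heads)
print(weights.shape)   # (5, 5): one weight for every pair of words
print(combined.shape)  # (5, 8): two heads of size 4, joined together
```

Because the numbers here are random, the weights will not show ‘Bay’ attending to ‘director’. In a trained model, those weights are what encode such relationships.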
OpenAI used supervised learning to train ChatGPT: humans manually feed the model examples of correct answers, allowing it to learn what correct answers look like for future predictions. They also incorporated reinforcement learning, a machine learning framework in which the model learns to make decisions based on trial-and-error feedback, in order to improve its ability to generalize beyond what it has been specifically trained on. This way, ChatGPT can receive feedback and adapt its responses based on how well they are received, allowing it to improve its overall performance and generalize to new, unknown prompts. This is all done in a three-step process.
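The trial-and-error idea behind reinforcement learning can be sketched with a toy example. This is not OpenAI’s training procedure or the three-step process itself. It is a simple policy-gradient loop over three hypothetical canned replies, with made-up feedback scores standing in for human ratings. It only shows how repeated feedback can shift a model toward the responses that are received well.

```python
import numpy as np

def softmax(logits):
    # Turn raw preference scores into probabilities that sum to 1.
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical candidate replies and made-up feedback scores (assumptions).
responses = ["vague reply", "helpful reply", "off-topic reply"]
feedback = np.array([0.1, 0.9, 0.3])

rng = np.random.default_rng(0)
logits = np.zeros(len(responses))  # start with no preference
baseline = 0.0                     # running average of past rewards
learning_rate = 0.1

for _ in range(3000):
    probs = softmax(logits)
    choice = rng.choice(len(responses), p=probs)  # trial: pick a reply
    reward = feedback[choice]                     # feedback on that reply
    advantage = reward - baseline                 # better or worse than usual?
    # Gradient of log-probability of the chosen reply w.r.t. the logits.
    grad = -probs
    grad[choice] += 1.0
    # Raise the chosen reply's score if it beat the baseline, lower it otherwise.
    logits += learning_rate * advantage * grad
    baseline += 0.1 * (reward - baseline)

probs = softmax(logits)
best = responses[int(np.argmax(probs))]
print(best, probs.round(3))
```

The loop never sees the "right answer" directly. It only sees how each attempt scored, which is the key difference from the supervised step described above.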
There are many more applications for this generative AI technology, ranging from business products and services to everyday tasks. As ChatGPT becomes more mainstream, it’s important to be mindful of how to ask high-quality questions (prompts) in order to receive high-quality responses. The more details (data) that go into the question, the better the model is able to extrapolate meaning, and the better the response.
As you can see, the AI is good at summarization, but it fails to add any final takeaways that will leave the reader satisfied, even when specifically prompted to do so. This is why we only recommend using it for specific problems like the ones mentioned above. ChatGPT can be used to explain a technical idea in simple terms or as a jumping-off point for research. Keep in mind that it is currently rife with misinformation and lacks nuance, so don’t go writing your entire Ph.D. dissertation with it. Remember, this is a new technology that’s still in beta form. Like the majority of AI solutions today, ChatGPT should be used as a tool to assist with your work, rather than a replacement for it.
Nick Scipione is a Data Engineer at Valkyrie, where he is also a member of the Algorithm Accountability Task Force. He earned his BS in Chemical Engineering from Northeastern University in Boston, MA where he learned how to use data science to solve a wide range of problems. He is excited to play an important role in shaping the rapidly advancing field of artificial intelligence.
Mo Salinas is a Data Scientist at Valkyrie. He earned his BS in Physics and Biophysics from the University of Texas at Austin, where he worked in various research fields from astrophysics to molecular neuroscience. Over the past two years, Mo has applied his insights into natural phenomena to data solutions with real-world business impact.