Unleashing Creativity with Generative AI: From Prompts to Custom GPTs
Preeti Arora
VP of Engineering
Understanding Generative AI: From Basics to Custom GPTs
In the world of technology, generative AI has rapidly gained momentum, making waves across various industries. Today, we explore its nuances, including essential concepts like prompt engineering and the powerful capabilities of Custom GPTs.
What is Generative AI?
Generative AI refers to algorithms capable of creating new content, such as text, images, or audio, based on patterns and data learned during training. The evolution of this technology can be viewed through several layers:
- Machine Learning: Basic algorithms enable computers to learn from data.
- Deep Learning: Advanced neural networks handle complex tasks with feedback loops.
- Generative AI: The latest paradigm where machines create unique content derived from pre-trained models.
Using large datasets, generative AI, particularly through Large Language Models (LLMs), can generate sophisticated outputs, including text in multiple human languages, music, images, and even videos.
How Does Generative AI Work?
- Generative AI models require extensive training data that serves as the foundation for learning.
- The models identify underlying patterns and structures within the training data.
- Using these patterns, the models then create new content that often mirrors the characteristics of the original dataset.
However, caution is necessary. These models can exhibit biases based on their training data. Thus, ethical considerations are crucial when utilizing generative AI solutions.
Key Types of Generative AI Models
Researchers have developed various generative AI models, including:
- Generative Adversarial Networks (GANs): Used for tasks such as image synthesis and data augmentation.
- Transformer Models: Fuel natural language processing and excel at parallel processing.
- Diffusion Models: Focus on generating high-quality images by removing noise.
- Variational Autoencoders (VAEs): Encode and decode complex data for various applications.
Diving Deeper: Large Language Models (LLMs)
At the heart of many generative AI tools lies the transformer architecture, which optimizes natural language processing. LLMs like GPT-3 and GPT-4 utilize intricate neural networks to process and generate human-like language.
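To make the token idea concrete, here is a minimal sketch using the Hugging Face transformers library and the GPT-2 tokenizer (an illustrative choice; the article names no specific tokenizer) showing how text becomes the token IDs a transformer actually processes:

```python
# pip install transformers
from transformers import AutoTokenizer

# GPT-2's tokenizer is freely downloadable and uses the byte-pair-encoding
# style of sub-word splitting common to GPT-family models.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Generative AI creates new content from learned patterns."
ids = tokenizer.encode(text)                   # text to integer token IDs
tokens = tokenizer.convert_ids_to_tokens(ids)  # IDs back to readable sub-words

print(tokens)  # sub-word units: words are often split into pieces
print(ids)     # the integers the transformer's layers actually consume
```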
Understanding Prompt Engineering
Prompt engineering is the art of crafting effective prompts to extract desirable outputs from generative AI models. Elements of effective prompts include:
- Clarity: Make the prompt easy to understand.
- Specificity: Clearly define the intent of the task.
- Context: Provide background information to guide the AI.
Types of prompts can vary significantly; a short sketch after this list illustrates each style:
- Zero-Shot Prompts: Direct instructions without examples.
- One-Shot Prompts: Include one example to inform the model's output.
- Few-Shot Prompts: Provide multiple examples for more nuanced responses.
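Here is that sketch; the sentiment-classification task and all example wording are illustrative, not taken from the session:

```python
# Illustrative zero-, one-, and few-shot prompts for a sentiment task.
# Only the prompt *shapes* matter; the reviews are made up.

zero_shot = (
    "Classify the sentiment of this review as positive or negative: "
    "'The battery died in an hour.'"
)

one_shot = (
    "Review: 'Loved the screen!' -> positive\n"
    "Review: 'The battery died in an hour.' -> "
)

few_shot = (
    "Review: 'Loved the screen!' -> positive\n"
    "Review: 'Shipping took forever.' -> negative\n"
    "Review: 'Works exactly as described.' -> positive\n"
    "Review: 'The battery died in an hour.' -> "
)
```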
Hands-On Experience with Generative AI
For those keen on practical application, platforms like Kaggle allow users to explore generative AI through easy-to-use interfaces. Here's a basic outline of how to get started, with a minimal sketch after the steps:
- Set up an environment with required libraries.
- Use example zero-shot prompts to gauge the model's responses.
- Experiment with parameters like temperature to control creativity.
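Here is that sketch, assuming the google-generativeai SDK the session uses later and an API key stored in a GOOGLE_API_KEY environment variable (the variable name is an assumption):

```python
# Step 1: environment setup (pip install google-generativeai).
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # assumed variable name
model = genai.GenerativeModel("gemini-1.5-flash")

# Step 2: a zero-shot prompt to gauge the model's response.
print(model.generate_content("Describe generative AI in two sentences.").text)

# Step 3: raise the temperature for more varied, creative output.
creative = model.generate_content(
    "Suggest a name for a travel blog.",
    generation_config=genai.types.GenerationConfig(temperature=1.5),
)
print(creative.text)
```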
Introduction to Custom GPTs
Custom GPTs are tailored versions of GPT models with specific instructions for particular tasks. Here's how they can be utilized:
- Automate daily tasks like summarizing emails or managing documents.
- Enhance personal productivity with tailored functions.
- Leverage artificial intelligence for greater creativity on social media.
To create a Custom GPT, a ChatGPT enterprise license is necessary. Once set up, users can define commands and customize their AI's capabilities; Custom GPTs that have already been created and published remain available even after the license is canceled.
Video Transcription
I am Preeti Arora. I am working as vice president of engineering at Deliveroo, and I've been in the IT industry for more than two decades, leading multiple digital transformation and strategic technology interventions at various B2B and B2C companies. I've been at the forefront of driving multiple business OKRs for Fortune 100 companies like Walmart Global Tech and Atlassian. And since the beginning of this calendar year, I've been working at Deliveroo, driving restaurants and new-verticals growth. Quick disclaimer: every part of the content I'm covering in this session reflects my personal learnings and personal views, and has nothing to do with my professional work at Deliveroo. This session is purely meant to be a learning session for those of you who are excited to be in the space of generative AI.
So with that, without much ado, let's get into the topic. Today's topic, as we all know, is all about generative AI, from prompts to custom GPTs. While I totally understand that many of you may already be using generative AI, whether through ChatGPT or Gemini or Perplexity or many more such Gen AI tools, what I thought would be useful for today's session is to get a little under the hood.
Let's understand what's behind this exciting space from a tech architecture and tech details perspective. Let's also understand the fundamentals of prompt engineering, and see some cool things about how we can configure LLMs to do multiple kinds of tasks in a programmatic way. And, one of my personal favorites, let's see what kind of cool stuff custom GPTs, a feature OpenAI launched a few months back, can do for you. That's what we'll be covering over the next thirty minutes or so, and I'll leave the last few minutes of this session for Q&A, where I'll take your questions as much as I can.
And in case any questions still linger in your mind after this session, feel free to connect with me on LinkedIn and send a DM, and I'll be super happy to take your questions individually as well. Alright. So what exactly is generative AI? There's been so much buzz about this entire technology paradigm for the last couple of years already, and I thought it would be really useful for us to do a basic 101. Well, artificial intelligence, as we all know, is basically the science of having machines think and act like humans; that's the high-level, layman's meaning of artificial intelligence. As we go deeper into this discipline, what I eventually learned about how the mathematical and statistical models have been building one on top of the other is that it all started with pure machine learning algorithms, wherein all we were trying to do was have computers learn from data without human programming intervention: understand the patterns in the data, understand the categorization of the data, and let us make some meaningful interpretations about the data.
Let us do some kind of predictions about the data; that's what we call machine learning. Then, as we further refined the algorithms of machine learning, deep learning started emerging as a discipline, wherein neural networks were trained to perform more complex tasks, with feedback loops in between to ensure that machines could do more than just interpreting, categorizing, or predicting data. And as we kept refining the algorithms, there emerged a newer discipline, a newer science paradigm, known as generative AI, which is nothing but the ability to generate new text, new images, new videos, or any kind of content using the data that the underlying models have been pretrained on.
So that's how things have evolved over the last several years. Generative AI can learn complex subjects. It can learn any of the human languages, be it English, German, French, Spanish, or any other language that comes to your mind. It can learn programming languages. It can learn the creative subjects of the arts and the harder subjects of science, like chemistry. Chemistry used to be my harder subject. Essentially, under the hood, generative AI is using models which we've come to know as large language models, trained on huge datasets. These models use multilayered neural networks to identify pattern structures within the data used to train them. And then, effectively, these models are able to generate new content of different types: even music, even audio, even videos, not just text or images.
So how does it work under the hood? Essentially, there have to be loads and loads of training data, which serves as the foundation for the model's learning. The model then works by recognizing essential patterns and underlying structures within the training data. And using the patterns it has learned, the underlying model starts creating new content. That content also has a bias towards reflecting the characteristics it has learned from the training data. So, just a word of caution: yes, generative AI has its biases. It can hallucinate as well. And I'm sure you all know these basics.
So there is a sense of responsibility that we all have to apply while using generative AI. And the reason is that, under the hood, it is using input data for the models to be trained on. If your input or training data is a well-balanced set of data, your results are going to be much more well balanced. Needless to say, if your underlying data is more skewed towards certain biases, your outputs are going to reflect that skew. Moving along. There have been multiple kinds of generative AI models that researchers have been able to develop over the last several years. On this slide, I'm just talking about some of the very key models that have emerged over the last few years and have taken a significant front footing for people to tackle different kinds of use cases.
GANs, or generative adversarial networks, are increasingly used for image synthesis, data augmentation, and synthetic image generation. Then we have transformer models. These are essentially being used for natural language processing and text generation, and are very adept at parallel processing. We also have diffusion models, which attempt to remove random noise to generate very clear, detailed images, and are the backbone of high-quality image and video generation tools. And then we have something called VAEs, or variational autoencoders, that encode complex data and then decode it to generate new variations: to generate images, to do image compression, and for audio and video generation as well. So that's a high-level view of the key types of generative AI models. And, as you may be able to connect the dots yourself, every generative AI tool that you are already using, or have been excited to use in the coming days, is built on top of one of these kinds of models under the hood.
Now, because I did say that we'll try to get a little more under the hood today, one particular aspect that I thought I should cover a little more in this session is LLMs, or large language models, since a lot of the revolution that came into this space since the advent of ChatGPT is based on LLMs. LLMs are nothing but models that are adept at processing natural languages. And for these kinds of models to really become effective, the underlying architecture is a transformer-based architecture, wherein the input data is a collection of tokens. These tokens get processed by the underlying model. There is a feed-forward loop and there's an attention loop, and these loops are nothing but implementations of advanced neural networks: multiple dense layers of neural networks.
They then produce output that has to be decoded by the neural network itself. This transformer-based architecture is a fundamental building block for all kinds of LLMs, be it the GPT-3 or GPT-4 models, the LLaMA models being used by Meta, or the Gemini models that Google has produced. Most of the different kinds of LLM models that have existed for quite some time now have the transformer-based architecture as the underlying backbone. This concept was introduced a few years back, in 2017, and you can read more about it in Google's famous paper, "Attention Is All You Need". For those of you who are really into the depths of this technology, I would recommend this paper as a must-read. It's a very interesting read. Moving on. So that's a very high-level look under the hood at what exactly is shaping these generative AI tools.
Now, over time, each one of us has been hearing so much about prompts: how prompts are needed, and how nicely curated prompts are needed to get very good outputs from any kind of generative AI tool. So let's dig a little deeper into what exactly is meant by prompt engineering. In simple layman's terms again, it is the art of designing input prompts to guide your generative AI model. It is very crucial that we curate these prompts to maximize the effectiveness of your generative AI tool. Especially if your tool is an NLP kind of Gen AI like ChatGPT or Gemini, it's very important that your prompts are curated properly. So let's see what the key elements of effective prompts are. An effective prompt has to be clear. It has to be specific.
The intent of the task that you want the model to perform should come out very clearly in the prompt. Providing some context in the prompt helps the underlying Gen AI model, or the underlying LLM, produce a relevant response, so providing adequate background information is always a good tip. The other bit is that a good prompt engineer would typically be brief but, at the same time, comprehensive. Give enough detail without using overwhelming jargon. So keep it detailed yet concise and brief, and that's an art, not necessarily a science, I would say. Now, types of prompts.
We can use zero-shot prompts for any kind of Gen AI model or tool. Essentially, what a zero-shot prompt means is that you just give a simple instruction to the model without any examples, and the model uses its existing training knowledge to generate the response. In a one-shot prompt, you give one particular example to guide the response, just one example, and your Gen AI tooling, or the underlying LLM, will use this single example as a reference point to guide the desired output. And the other kind is few-shot prompts, where you're giving two or more examples to clarify the kind of output that you are expecting from the Gen AI model. There are more prompt types. There can be instruction-based prompts, wherein you're just providing a sequence of instructions: step one, do this; step two, do this. And there can be contextual prompts, wherein, for better relevance, you keep providing context with every instruction.
There are more prompt types as well, but this is just a high-level view. We'll cover some of the other prompt types in the practical, hands-on exercise coming up. Alright. So that brings us to the practical hands-on piece, which I personally have been very excited about. Those of you who are new to trying things hands-on, don't worry. I have put up a sample file consisting of Python instructions, which is a publicly available file, and I'll put the URL to access it in the chat window right after I finish. So don't feel pressured to take a lot of notes. Just be with me and see what magic can be done with a little bit of programming understanding.
And you don't really have to know programming. You can always use Gemini Code Assist to help you write the code, to customize your LLM, to write better prompts, and whatnot. So with that, I'm just going to change my screen sharing mode, and let's start seeing something more exciting out here. Alright. Yes. Let me first check: can everyone view? Yeah, looks like everyone can view. And I know there are some questions coming in on the chat. Thank you, team, for asking all the questions. I'll come back to them in a little while. There's some exciting stuff that I want to show hands-on first.
For those of you who do not know what Kaggle is all about, Kaggle is a simple, free-to-use platform for learners of machine learning and AI. I personally like to use it in my personal hobby time to do some quick hands-on work that is not related to my day-to-day professional work, just to keep doing my own learning on the side. So what I've done over here is produce this Kaggle notebook file, and I'll be giving out the URL for this public file as well so that you can all try it yourselves. Essentially, in this particular hands-on exercise, I'm using Google's Gemini Flash as the base LLM to show what can be configured for an LLM to get better results, or different kinds of results. And we'll also see the same file being used to show multiple different kinds of prompt engineering examples. So all I'm doing first is a pip install of google-generativeai, so that Kaggle is able to load Google's generative AI models into the running workspace.
Nothing fancy out here. Then I provide a Google API key so that I have the right permissions to run everything. Nothing I'm using here requires you to spend a penny; this is all free stuff, so it should not be too hard for anybody to set up. But as I said, in case you run into problems after the session, I'll be very much reachable via a DM on LinkedIn, and I'll be super happy to help you out. So out here, what I've said is that I want to use the Gemini 1.5 Flash model as the LLM for this particular exercise. And why am I using this model? This is a lightweight model.
Since I'm running this whole thing on my own MacBook, I really didn't want out-of-memory kinds of situations to occur. For personal pet projects, Gemini 1.5 Flash is particularly apt. If you have any kind of production workload, I would suggest you do your own research and use the right LLM for it. The way to start giving any kind of prompt to the LLM is very simple: just put in a simple prompt. This is essentially an example of a zero-shot prompt, "explain AI to me like I'm a kid", and look at the kind of response this model just produced.
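A minimal sketch of the setup and zero-shot prompt just described, assuming the API key lives in a GOOGLE_API_KEY environment variable (on Kaggle it would typically come from the Secrets add-on instead):

```python
# pip install google-generativeai
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])  # assumed variable name

# Gemini 1.5 Flash: the lightweight model chosen for this exercise.
model = genai.GenerativeModel("gemini-1.5-flash")

# A zero-shot prompt: a direct instruction with no examples.
response = model.generate_content("Explain AI to me like I'm a kid.")
print(response.text)
```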
And then, if I want to implement some kind of chatbot, where I type something and the chatbot intelligently starts responding to me, this is how you would typically go. I'm just telling the model that my name is Preeti, and the model has started to interact with me; it is greeting me. Then I ask some question again, and it just starts answering, in a conversational, chatbot-style interface. Now, Gemini gives a whole lot of models for people to use. So this next bit is more introductory, where I'm just listing what Gen AI models Gemini, or rather Google, has available, and this is nothing but a simple list coming out there.
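Continuing the same sketch, reusing the model object from above; the follow-up question is a hypothetical stand-in, since the session doesn't quote it:

```python
# A conversational, chatbot-style session: the chat object keeps history,
# so the model remembers the name given in the first message.
chat = model.start_chat(history=[])
print(chat.send_message("Hello! My name is Preeti.").text)
print(chat.send_message("Can you suggest a topic to learn today?").text)  # hypothetical follow-up

# Listing the generative models Google exposes through this API.
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)
```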
A few things, basically. When it comes to using these LLMs programmatically, there's a lot more that can be done. You can actually tell the LLM the maximum or minimum output you want in the response that gets generated. Like over here, I'm giving a very clear prompt: give me a thousand-word essay. Or, a little more subjective, but still limiting the output tokens: write me a short poem on the importance of olives. Another beautiful parameter that I really want everyone to put a little energy and attention into understanding is temperature. Every LLM exposes this configuration called temperature. In simple words, it controls the degree of randomness in the output being produced by the LLM. The higher the temperature, the more varied the output of your LLM will be. And the lower the temperature, the more consistent, or more deterministic, the output of your LLM will be. So in this particular example, I've used a high-temperature variant and said: okay, pick a random color.
And you can see that, out of the five times I execute this, it did put a different color at least one time. So there is that variation. But if I use a low-temperature setting, then out of the five times I execute, the output is consistent each time, because I have set the low temperature. So when you are using these LLMs in a programmatic environment, there are two kinds of use cases: when you want to be more creative with the output being produced, use a high temperature in the model configuration; when you want definitive, consistent outputs from your LLM, use a low-temperature setting. Top-k and top-p are other interesting parameters to control the diversity of your model's output.
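A sketch of those configuration knobs, reusing the same model; the specific parameter values are illustrative, since the session doesn't show its exact numbers:

```python
from google.generativeai.types import GenerationConfig

# Cap the length of the response in output tokens.
poem = model.generate_content(
    "Write me a short poem on the importance of olives.",
    generation_config=GenerationConfig(max_output_tokens=100),
)
print(poem.text)

# High temperature: more randomness, so repeated runs vary.
for _ in range(5):
    print(model.generate_content(
        "Pick a random color.",
        generation_config=GenerationConfig(temperature=2.0),
    ).text)

# Low temperature: near-deterministic, so repeated runs agree.
for _ in range(5):
    print(model.generate_content(
        "Pick a random color.",
        generation_config=GenerationConfig(temperature=0.0),
    ).text)

# top_k: sample only from the k most probable next tokens.
print(model.generate_content(
    "Pick a random color.",
    generation_config=GenerationConfig(top_k=3),
).text)
```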
I would just leave it at top-k, as it's the more important one. Top-p, I think, requires more study of the underlying probability theory, which many of you may find a little boring as a concept. But top-k is more interesting: it defines the number of most probable top tokens that you are telling the LLM to select the output from. And again, some more examples on the prompting part. We did talk about zero-shot and few-shot kinds of prompting techniques. Zero-shot is a simple instruction; you're not giving an example. But let's see a few-shot prompt from a programmatic perspective.
What I'm telling it over here is that I want a particular pizza order to be parsed into valid JSON. This is my example: if the customer order is "small pizza with cheese, tomato sauce, pepperoni", the JSON response should look like this: what is the size, what is the type, what are the ingredients. And then I give it another example. These examples are basically going to guide your LLM. And now, when I actually give it a particular customer's order, "give me a large with cheese and pineapple", it generates a JSON using the examples I gave it. So this is an example of the few-shot prompt technique being used over here.
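Reconstructed as a sketch: the first example follows the session's wording, while the second example is an assumed stand-in, since it isn't quoted:

```python
# Few-shot prompting: worked examples teach the model the output format.
# The second example below is an assumption; the session doesn't quote it.
few_shot_prompt = """Parse a customer's pizza order into valid JSON.

EXAMPLE:
Order: Small pizza with cheese, tomato sauce and pepperoni.
JSON: {"size": "small", "type": "normal", "ingredients": ["cheese", "tomato sauce", "pepperoni"]}

EXAMPLE:
Order: Medium pizza with mushrooms and extra cheese.
JSON: {"size": "medium", "type": "normal", "ingredients": ["mushrooms", "extra cheese"]}

Order: Give me a large with cheese and pineapple.
JSON:"""

print(model.generate_content(few_shot_prompt).text)
```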
There's another technique, the chain-of-thought technique, wherein you give the model a prompt and then keep building on it in steps. In the first prompt, I'm just giving it a puzzle, and then I want it to think step by step. And in terms of those steps, I keep prompting it with further reasoning. So that's your chain-of-thought prompt technique. These are the kinds of techniques that you can use for prompt engineering, and I'm going to leave this file with you all for more interesting self-learning and self-research as well.
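A sketch of the technique; the puzzle is a stand-in, since the session's exact puzzle isn't reproduced:

```python
# Chain-of-thought prompting: ask the model to reason step by step
# before giving its final answer.
cot_prompt = (
    "When I was 4 years old, my sister was half my age. "
    "Now I am 30. How old is my sister? "
    "Let's think step by step."
)
print(model.generate_content(cot_prompt).text)
```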
Another aspect that I do want to cover, which is quite exciting as well, is building custom GPTs. I'm going to switch my screen sharing to something else again, but before I do: custom GPTs are essentially a feature from OpenAI for ChatGPT. Now, this feature requires you to have a ChatGPT enterprise license. I would not outright say that everyone should just go and purchase an enterprise license of ChatGPT. But if you do see value in it for your personal or business use case, definitely go for it, and you can get to a point where you've created the custom GPTs you need to become more productive, or to automate some of your use cases.
And you can then give up the license, because even after you've canceled it, the custom GPTs that you already created and published continue to be with you. So now let's see that bit in action. Let me do my screen sharing once again. Alright. So, simply chatgpt.com. As I said, for creating custom GPTs, you need an enterprise license. What exactly are custom GPTs? Custom GPTs are nothing but GPTs that have been curated with specific instructions, natural language prompts, that you have given with a particular task, a set of tasks, or a particular automation in mind. And essentially, there are multiple ways you can use custom GPTs: for any kind of workflow you want automated, for personal productivity boosters, or just for getting content auto-generated on certain topics you like to write about.
There can be multiple kinds of use cases. How does it all come into play? When you have a ChatGPT enterprise license, out here you see something called "My GPTs". This will show you the GPTs that you've already created, the GPTs that may have been shared with you, and then you can always create a new GPT for yourself. As an example, and I'm not going to go into the internals of this particular custom GPT, I created a custom GPT to do a daily digest of my professional inbox, my work inbox, just because my day is usually filled with back-to-back meetings. I really find very little time in a day to keep going into my inbox, reading emails, and determining which of them are important or not important for me to read or respond to. So I wrote a custom GPT that says: okay, get into my professional inbox daily at 9 AM IST.
Dump all of the emails into a particular Google Sheet. Highlight which of the emails are important. Also highlight which of the emails need a response from me that day. And voilà, the custom GPT does that. Given that this is a GPT I created for my work, I'm not going to go into its internals. But you can, in your regular life, create such a custom GPT for your own personal inbox, for your own work inbox, or for any other kind of productivity challenge. And look at this: it is already prompting you. "Make a creative who helps generate visuals for new products." Right? "Make a software engineer who helps format my code."
I've heard of engineers who are actually using this kind of custom GPT interface to have an assistant software engineer by their side, writing new code for lower-complexity features while they work on higher-complexity features. A cool use.
I've had people in my own personal network use custom GPTs to be very creative and make a splash on their social media handles, be it LinkedIn, Insta, or Twitter. What they've been doing is putting commands into the custom GPT window to say: okay, you know what, take today's feed from Twitter, see which topics are trending, and write me a few very interesting tweets that I would want to post on my Twitter handle. And the custom GPT just produces those. What they really do is spend five minutes in the morning going through the output generated by this custom GPT and selecting, okay, push this particular tweet to my Twitter handle, and voilà. They've been creating ripples on their social media platforms. So, very cool uses.
I've had a mentee of mine use a custom GPT to analyze her resume and understand the gaps her resume had against the particular kinds of jobs she wanted to apply for, then have the custom GPT fix her resume and keep it ready. She then used that output to apply to those jobs. So there are numerous ways you can set up a custom GPT. It's very simple. All you need to do is give it a command: "create me a daily summarizer for my personal Gmail account". Simple language instructions are all it needs, and it will prompt you for whatever further information it needs from you to do the task you're asking of it. And all of this can continue in this create-and-configure workspace, assuming you have invested in an enterprise ChatGPT license. See, it's even asking for a name for the GPT. I'm just going to give it very brief instructions. Yes, that sounds good. And it'll keep on asking.
And once you've done all of this creating and configuring, all you have to do then is, eventually, hit Create. The Create button becomes enabled once all of your commands to this custom GPT are done. You see that, right? It's working on enabling the Create button now. Hitting Create lets you do two things: either you keep it to yourself, publishing it just for you, or, if it's something more generic that you would want to share with your own family, your friends, or your work team, you can always share it with others as well.
I am not going to make any of these changes out here, but as you can see, it lets you invite whoever else you want this custom GPT to be available to. And once you've created all of that, your custom GPT becomes available here. Now, if at any point I give up my enterprise license, I will still have this custom GPT available to me, and that's the USP that OpenAI has created for you. So with that, I'm going to stop.