Building LLM-based solutions with Snowflake by Pooja Kelgaonkar
Pooja Kelgaonkar
Senior Data Architect
Unlocking the Power of LLM Solutions with Snowflake Native Services
Hello everyone, and welcome to our session on building LLM solutions using Snowflake native services. Today, we'll dive into how Snowflake's built-in features can streamline your implementation process and enhance your data capabilities.
Introduction to LLMs and Snowflake
LLMs and generative AI have garnered significant attention in recent years. With various data platforms offering these solutions, Snowflake stands out as a leading choice. What sets Snowflake apart is its native features and services, enabling users to develop sophisticated machine learning (ML) solutions with minimal setup.
About Me
I'm Pooja Kelgaonkar, a Senior Data Architect with Rackspace, and a proud Snowflake Data Superhero for over three years. I have authored a book titled Mastering Snowflake Platform and am currently working on my next book, expected to launch soon. My passion lies in all things data, and Snowflake remains my favorite platform for exploring new data solutions.
Snowflake's AI & ML Capabilities
Snowflake provides built-in functions categorized mainly into two areas:
- Snowflake Cortex: This umbrella encompasses many pre-built LLM models readily available for use within the platform.
- Snowflake ML: This allows for more customization, where users can build models from scratch or integrate existing models seamlessly.
Key Features of Snowflake ML
- ML Functions: ML functions fall into two main categories: built-in functions designed for analytical applications, and customizable models for specific needs.
- Model Registration: Snowflake offers a model registry, supporting CI/CD for effective deployment and maintenance.
- Cost Management: Understand cost implications based on data storage and compute usage, guiding efficient resource allocation.
Exploring Snowflake Cortex Functions
Within Snowflake Cortex, various LLM functions are available. These models can be utilized through simple SQL commands, making integration straightforward. Some highlights include:
- Sentiment Analysis: Analyze the sentiment of texts effortlessly.
- Text Generation: Generate high-quality text based on prompts.
- Summarization: Condense large bodies of text into meaningful summaries.
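Each of these capabilities is exposed as a plain SQL function. A minimal sketch of the three listed above, with function names as documented for Snowflake Cortex; the model name and sample text here are illustrative:

```sql
-- Sentiment analysis: returns a score roughly between -1 (negative) and 1 (positive)
SELECT SNOWFLAKE.CORTEX.SENTIMENT('The movie was a delight from start to finish.');

-- Text generation: pass a model name and a prompt, get generated text back
SELECT SNOWFLAKE.CORTEX.COMPLETE('snowflake-arctic', 'Write a one-line tagline for a data platform.');

-- Summarization: condense a long passage into a short summary
SELECT SNOWFLAKE.CORTEX.SUMMARIZE('A long product review with many paragraphs of detail goes here.');
```

Because these are ordinary SQL expressions, they compose with `SELECT ... FROM` over tables just like any built-in function, which is what the demo later shows.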
Cost Management in Model Training
Understanding the costs associated with Snowflake’s usage is essential. Key considerations include:
- Data Storage: Costs correlate with the volume of data being stored and processed.
- Compute Resources: The type of Snowflake warehouses being used impacts overall cost—especially with ML and Cortex operations.
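One practical way to keep an eye on Cortex spend is through the account usage views. A hedged sketch, assuming access to the `SNOWFLAKE.ACCOUNT_USAGE` share; the view and column names below follow Snowflake's documentation, so verify them against your account before relying on this:

```sql
-- Credits consumed per Cortex function and model over the last 30 days
SELECT function_name,
       model_name,
       SUM(token_credits) AS credits_used
FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY function_name, model_name
ORDER BY credits_used DESC;
```

A query like this makes it easy to spot which models drive cost, which is useful when deciding which unused models to retire.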
Practical Demonstration
In our session, we accomplished the following:
- Utilized the COMPLETE function with various models, such as Snowflake Arctic, to demonstrate text generation.
- Implemented sentiment analysis on sample movie reviews, showcasing dynamic interaction with data directly within Snowflake.
- Executed text translation, illustrating the multilingual capabilities available within Snowflake's framework.
Governance and Security in Snowflake
When integrating AI solutions, questions surrounding governance and security are paramount. Users can establish guardrails to prevent the model from generating harmful outputs, ensuring safe and responsible AI usage.
Conclusion
We hope today’s session has provided valuable insights into implementing LLM solutions using Snowflake. As you explore these features, consider how they can fit into your existing data architecture to drive transformative results.
Stay Connected!
If you're interested in learning more, you can order my book from Amazon, and I invite you to connect with me on LinkedIn. Thank you all for joining today’s session—we look forward to seeing how you leverage Snowflake for your LLM solutions!
Video Transcription
Hello, everyone. Welcome to the session. Hope you are enjoying the summit and the sessions overall. Ping me if you see any issues with my audio or video. Okay. Cool. So I'm switching the screen to presentation mode. Is it visible? I believe so. It's coming up in presentation mode. Alright. So I'm going to talk about implementing LLM solutions using Snowflake native services. LLMs and generative AI have been a hot topic for a while now, and most data platforms are offering these solutions. One of the leading platforms is the Snowflake data platform, which offers native features and services we can use to develop these solutions.
We don't see the full screen of your presentation. Can you go to View at the top?
No. This is how it is. Is it visible now?
No, it's not, because presenter mode opens in a different window, and you're sharing the PowerPoint window, which means we don't see the new window that opens.
Okay. So let me stop sharing and share the entire screen. Sorry.
Okay. That would work.
I'm sharing my entire screen in presentation mode. So is it okay?
It's okay.
Thank you so much. Alright. So Snowflake is one of the leading data platforms offering LLM integrations and solutions as part of its native features and services. These can simply be plugged in as SQL and are very easy to implement, without worrying about training, setting up the infrastructure, setting up the models, or maintaining them.
So in this session, we are going to see some of the offerings Snowflake has, plus a quick demo of how easy it is to use these models in a SQL-ish language. To start, let me introduce myself. I'm Pooja Kelgaonkar, working as a Senior Data Architect with Rackspace. I have been a Snowflake Data Superhero for the past three years now. My first book, Mastering Snowflake Platform, was published last year, and another book is coming this year, probably sometime this quarter. I love everything that deals with data, and that's why I love exploring various data platforms. Snowflake is one of my favorites.
So, talking about Snowflake AI and ML: as I mentioned, it offers built-in functions, and they are categorized into two areas. One is the Snowflake Cortex umbrella, under which lots of LLMs are made available as built-in models as part of the Snowflake platform. The other is Snowflake ML, where we can use Snowflake ML capabilities along with Snowpark ML and build a model from scratch instead of using something already available as a Snowflake function. So Snowflake ML is the part of the umbrella where we can build ML functions and models using customized code. We can also bring our own models. Let's say I already have a model running in a different environment; I can take that same model and bring it into Snowflake.
We just plug in the Snowflake ML libraries, get all the required prerequisites set up for the model along with the required data on the platform, and then we're good to use the model. Snowflake also has an ML workflow where we can register a model in a model registry, with the typical CI/CD offered by other cloud services. So we can have similar CI/CD, training, deployment, model registry, and maintenance done within the Snowflake platform itself. As part of this, it offers ML functions, and these fall into two categories. One is BYOM, bring your own model, where we can build customized models and implement them within the platform. The other, similar to the Cortex LLM functions, is ready-to-use embedded ML functions, specifically for analytical and time-series use cases.
So as of now, there are two analytical functions available. One is classification, which runs a classification model to classify entities. The second is time series, which we can use for anomaly detection and forecasting. Along with these native ML functions, Snowflake also applies them to cost. To give you an example: cost anomaly detection. Based on your usual Snowflake cost trend, a cost anomaly detection function has been introduced that can alert and notify customers depending on the spend so far. So we no longer need to go and check manually how much it has cost; we can use those functions directly, and they generate the alerts and notifications for us.
So those are the regular ML functions. Coming to Cortex: today's topic is more about Cortex and the LLM functions made available on Snowflake. I'm sure most of you are already using Snowflake or have started to. Snowflake offers LLM functions as part of the Snowflake Cortex offering, and these functions are available in select regions on AWS and Azure. There are built-in LLMs integrated with these functions; not all models are supported, but most of the famous ones are available. Snowflake also makes its own LLM available: Snowflake Arctic.
We can choose the model when we call the function; I'm going to show you that during the demo as well. So, we discussed the cost of the ML functions. The functions themselves are already available on Snowflake, but when it comes to using them, the majority of the cost depends on storage and compute. For storage, it's the data stored on the platform and the volume of data we use for training the model. Every time we build a model, we train it and we store it, and every time that's going to cost us. So it really depends on how much volume we use to train and run these models.
And of course the compute, because for ML and Cortex LLM workloads we prefer to use Snowpark-specific warehouses, and the cost associated with those warehouses differs from regular warehouses. So we definitely want to keep an eye on how much each function is costing depending on the volume, and maybe get rid of unused models to avoid the storage cost associated with them. Coming to the Cortex functions: as I mentioned, Snowflake has made them available as default functions, so no additional setup is required. We don't need any infrastructure to use them. They are simply made available as SQL functions, so just like any other SQL function, we can use them in our SQL and run them on top of the data residing within the platform. We're going to see in the demo how we can use a model with the help of simple SQL functions.
Talking more about the functions, there are three categories of built-in functions: COMPLETE, task-specific functions, and helper functions. COMPLETE can be considered the function that provides a wide range of operations: we can use it for text generation, for sentiment analysis, for classification, and to summarize text and give us details about any input we provide. We'll see how COMPLETE works and how it can be used for sentiment, reviews, and classification. Task-specific functions are built for one specific task, and we can use them to automate that task. The third category is helper functions, which are built-in functions we can use around failures of the other functions.
A little more on COMPLETE. This is just to introduce how COMPLETE works with some examples, but we are going to see this in the demo as well. It can be used for classification, text generation, and summarization. We can literally use COMPLETE to implement almost any GenAI use case, unless we want to use a task-specific function. So COMPLETE is one function that gives us multiple ways to implement, whereas with task-specific functions we have separate functions for sentiment analysis, entity recognition, classification, and translation.
COMPLETE can do the same things, but these are specific built-in functions trained to implement the functionality in their own category. So we can use SENTIMENT to get a sentiment, and COMPLETE can also do that. Then, helper functions. There are two helper functions available: COUNT_TOKENS and TRY_COMPLETE. COUNT_TOKENS gives us the token count for the input we provide, based on the model and function we are using. And TRY_COMPLETE is just like COMPLETE, but it returns NULL when the function cannot be executed. Generally, if we use COMPLETE and it fails, we get an error.
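The two helper functions just mentioned can be sketched like this; the function names follow Snowflake's Cortex documentation, while the model name and prompts are illustrative:

```sql
-- COUNT_TOKENS: how many tokens this input would consume for a given model
SELECT SNOWFLAKE.CORTEX.COUNT_TOKENS('snowflake-arctic', 'What is a large language model?');

-- TRY_COMPLETE: same as COMPLETE, but returns NULL instead of raising an error
SELECT SNOWFLAKE.CORTEX.TRY_COMPLETE('snowflake-arctic', 'What is a large language model?');
```

TRY_COMPLETE is handy in batch pipelines over many rows, where one bad input should produce a NULL rather than fail the whole statement.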
But with TRY_COMPLETE, instead of failing, the function gives us NULL as the return value. So let's go to the demo. I'm going to bring up Snowflake. Let me know if this is visible; I can zoom in a bit. Okay, I hope this is visible. Let me bring it down. Okay. Cool. We're going to see some generic usage of COMPLETE, and then I'm going to create a table, store some review comments, and run COMPLETE on top of it. Here, if you see, this is how we refer to a Cortex function: SNOWFLAKE.CORTEX followed by the function name. Here I'm using COMPLETE with the model name. Snowflake supports various models, such as Llama and Mistral; we can pick whatever model and version we want to use. Here I'm using Snowflake Arctic, which is native to Snowflake.
And if you see, this is the prompt. These are the two parameters provided to the function, the model and the prompt, so it knows which model to use and what prompt we are passing. If I run this one, it gives me results just like any chatbot: it tells me what the element is and the details about it. Okay? Next, we can also set the tokens and the temperature when we use a model. In the second example, I'm using a Llama model instead of Snowflake Arctic, and we can set how many max tokens we want to pass to the function.
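The two call shapes described here, a plain prompt and a call with generation options, look roughly like this. Per Snowflake's documentation, the options form takes a message array and returns a JSON object rather than plain text; the model names and prompt are illustrative:

```sql
-- Simple form: model name + prompt string, returns plain text
SELECT SNOWFLAKE.CORTEX.COMPLETE('snowflake-arctic', 'What is Snowflake Cortex?');

-- Options form: message array plus max_tokens / temperature, returns JSON
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'llama3-8b',
    [{'role': 'user', 'content': 'What is Snowflake Cortex?'}],
    {'max_tokens': 100, 'temperature': 0.2}
);
```

Lower temperature values make the output more deterministic, which usually suits analytical pipelines better than creative generation.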
So those are some additional parameters we can send to the function, and it gives us the detail in the result. Okay? Then there's the usual question: whenever we want to implement any GenAI use case on an existing platform, most of the time I get questions like: what about governance? What about security? How is the data exposed and used within the models? So there is another example where we can set guardrails. What this means is there is a parameter we can set on the COMPLETE function which prevents unwanted prompts and unwanted responses from the model.
We know that models often hallucinate. This parameter ensures the model avoids sharing harmful information and details. So, again, the same pattern: we specify the model, and here I'm using a Mistral model, then the content, the question, and guardrails set to true. The moment I set this up, if anything harmful comes back from the model (I know this particular prompt may not produce a harmful response, but for prompts where the model may hallucinate) it suppresses those responses in the function output. Next is sentiment analysis. I'd like to show you this on top of a table instead of providing a hard-coded prompt. So I have created a movie reviews table with the movie name, the review, and a rating.
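The guardrails setting described here is passed in the same options object as max_tokens and temperature. A sketch, with the model name illustrative and the option name as documented for Snowflake Cortex Guard:

```sql
-- With guardrails enabled, potentially harmful model responses are filtered out
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'mistral-7b',
    [{'role': 'user', 'content': 'Tell me about the history of databases.'}],
    {'guardrails': true}
);
```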
I have provided text here which gives a brief of the movie and a brief of a review; I got this from one of the movie sites. Now let's see how this runs. We can run this on top of the table directly, and as you can see, the SELECT and FROM clauses remain the same; I'm using it just like any other function. So what does the movie review say? The movie name is Arthur the King, and the sentiment score is 0.076. Okay. And let's see what sentiment it has. I can use the same approach to get the sentiment: let's get the sentiment and details from COMPLETE, and then use the SENTIMENT function to compare. Okay? So COMPLETE can also be used to get it.
So movie review is the column where I'm storing the review of the movie, and that's also part of the table name. What it says is that, generally, the review is positive, highlighting the film's themes. So it's saying that overall it is positive and the reviewer appreciates it: an overall positive sentiment. Now let's run the same with the SENTIMENT function, the task-specific one. It's the movie reviews table; the movie name isn't needed here, so let's get rid of that. Okay. It has given the score, but it hasn't said whether the review is positive or not. Compared with earlier, where we had a score of 0.07, this one definitely reads as a better review.
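The comparison just described, COMPLETE versus the task-specific SENTIMENT function over a table column, can be sketched as follows. The table and column names here are reconstructed from the demo narration, so treat them as illustrative:

```sql
-- Ask COMPLETE for a narrative sentiment judgment of each stored review,
-- and compare it with the numeric score from the task-specific function
SELECT movie_name,
       SNOWFLAKE.CORTEX.COMPLETE(
           'snowflake-arctic',
           'What is the sentiment of this review? ' || movie_review
       ) AS complete_sentiment,
       SNOWFLAKE.CORTEX.SENTIMENT(movie_review) AS sentiment_score
FROM movie_reviews;
```

As the demo notes, COMPLETE returns an explanation in prose, while SENTIMENT returns only a score, so the two are complementary.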
In terms of summarization, it reads the content from the movie review column and summarizes it. So it has summarized the column value: the text was quite big, and it has condensed the review content. Then translation. I'm asking it to translate from English to French. As of now, TRANSLATE doesn't support all languages; it supports some, and we can see which languages are supported in the documentation. And if you see, it has translated the same review from English to French. Alright, that's all for the demo. If there are any questions, I'm happy to take them. And before we say goodbye, these are the details: if you want to order my book, you can order it from Amazon.
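The summarization and translation steps over the same table might look like this; again, the table and column names are reconstructed from the demo, and TRANSLATE takes source and target language codes per Snowflake's documentation:

```sql
-- Condense each long review into a short summary,
-- and translate the review from English ('en') to French ('fr')
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(movie_review)          AS review_summary,
       SNOWFLAKE.CORTEX.TRANSLATE(movie_review, 'en', 'fr') AS review_fr
FROM movie_reviews;
```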
It is also available on PPP online. And if you want to connect, you can connect with me on LinkedIn. Thank you all for joining. I hope you enjoyed the session. I know it was fast, but I tried to cover as much as I could in the demo and the presentation. Thanks, everyone.