adding finetuning llms notebook #45

Open
wants to merge 1 commit into base: master
8 changes: 8 additions & 0 deletions notebooks/finetuning-llms/meta.toml
@@ -0,0 +1,8 @@
[meta]
title="Fine-Tuning LLMs with Our Own Custom Data"
description="""\
This example demonstrates how to fine-tune a base model using our own custom data.
"""
icon="vector-circle"
tags=["beginner", "llm"]
destinations=["spaces"]
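
Not part of the PR itself, but if you want to sanity-check this metadata locally, here is a minimal sketch. It assumes the file path shown above, and the required keys are an assumption about what the spaces-notebooks tooling expects; `tomllib` is in the Python 3.11+ standard library.

```python
# Hypothetical local check, not included in this PR.
# Assumes the repository layout above; the required keys are an assumption
# about what the spaces-notebooks gallery tooling expects.
import tomllib  # Python 3.11+ standard library

with open("notebooks/finetuning-llms/meta.toml", "rb") as f:
    meta = tomllib.load(f)["meta"]

for key in ("title", "description", "icon", "tags", "destinations"):
    assert key in meta, f"meta.toml is missing the '{key}' key"

print(f"OK: {meta['title']}")
```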
1 change: 1 addition & 0 deletions notebooks/finetuning-llms/notebook.ipynb
@@ -0,0 +1 @@
{"cells":[{"cell_type":"markdown","id":"38214208-8d54-4f44-8333-58215904f1e6","metadata":{"language":"python"},"source":"<div id=\"singlestore-header\" style=\"display: flex; background-color: rgba(209, 153, 255, 0.25); padding: 5px;\">\n <div id=\"icon-image\" style=\"width: 90px; height: 90px;\">\n <img width=\"100%\" height=\"100%\" src=\"https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/vector-circle.png\" />\n </div>\n <div id=\"text\" style=\"padding: 5px; margin-left: 10px;\">\n <div id=\"badge\" style=\"display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%\">SingleStore Notebooks</div>\n <h1 style=\"font-weight: 500; margin: 8px 0 0 4px;\">Finetuning LLMs with Our Own Custom Data</h1>\n </div>\n</div>"},{"cell_type":"markdown","id":"7190e9db-efee-4af0-a608-68162118e583","metadata":{"language":"python"},"source":"## Fine Tuning LLMs with Our Own Custom Data"},{"cell_type":"markdown","id":"4a6043da-0b36-4438-8c34-ae392e26dd70","metadata":{"language":"python"},"source":"Gradient is a solution that provides a lot of LLM models that we can use for fine tuning. Singn up to Gradient, create an account and get the workspace id and access token (we them later on in this tutorial)"},{"cell_type":"markdown","id":"db980f79-6306-47a3-a6c5-d2ca2dc9f662","metadata":{"language":"python"},"source":"Let's get started with installing gradientai"},{"cell_type":"code","execution_count":null,"id":"23b30676-3a56-4f17-82e0-4ab87aa93126","metadata":{"language":"python","trusted":true},"outputs":[],"source":"!pip install gradientai --upgrade"},{"cell_type":"markdown","id":"b7014cb0-c001-4e0e-af87-137161d71620","metadata":{"language":"python"},"source":"## Add the Gradient workspace id and access key "},{"cell_type":"markdown","id":"827dd810-fa24-45a4-adb4-26048d0530c9","metadata":{"language":"python"},"source":"The workspace id and access key are present in the dashboard once you signin to your Gradient account. "},{"cell_type":"code","execution_count":null,"id":"4d6f8f69-a479-4406-ab62-bb7577cafdde","metadata":{"language":"python","trusted":true},"outputs":[],"source":"import os\nos.environ['GRADIENT_WORKSPACE_ID']=''\nos.environ['GRADIENT_ACCESS_TOKEN']=''"},{"cell_type":"markdown","id":"2703299f-51c2-422a-96fd-c71b4f9facb8","metadata":{"language":"python"},"source":"## Let's Fine-Tune the Base Model"},{"cell_type":"markdown","id":"bafc9171-bdc1-46bb-b299-2bc34e7c500a","metadata":{"language":"python"},"source":"In the available options from Gradient, let's fine-tune the base model (nous-hermes2). Using my own example here, I am Pavan Belagatti and a developer evangelist at SingleStore. If you see below, I am adding samples on my own to make the base model understand more context about me. Once you run this, the model's response before fine-tuning will be a very hallucinated answer. Once the model is trained on our custom data, the output you see after fine-tuning will be a complete relevant and contextually correct answer. I have purposely taken my own name as an example here to show how you can fine-tune the models accordingly. 
"},{"cell_type":"code","execution_count":null,"id":"c5e4f8f4-2b65-483b-a12d-7acf36e866c1","metadata":{"language":"python","trusted":true},"outputs":[],"source":"from gradientai import Gradient\n\n\ndef main():\n gradient = Gradient()\n\n base_model = gradient.get_base_model(base_model_slug=\"nous-hermes2\")\n\n new_model_adapter = base_model.create_model_adapter(\n name=\"Pavanmodel\"\n )\n print(f\"Created model adapter with id {new_model_adapter.id}\")\n\n\n sample_query = \"### Instruction: Who is Pavan Belagatti? \\n\\n ### Response:\"\n print(f\"Asking: {sample_query}\")\n ## Before Finetuning\n completion = new_model_adapter.complete(query=sample_query, max_generated_token_count=100).generated_output\n print(f\"Generated(before fine tuning): {completion}\")\n\n samples=[\n {\"inputs\":\"### Instruction: Who is Pavan Belagatti? \\n\\n### Response: Pavan is an award-winning developer advocate and is an author at various technology publications\"},\n {\"inputs\":\"### Instruction: Who is this person named Pavan Belagatti? \\n\\n### Response: Pavan Belagatti Likes Gen AI, DevOps and Data Science and he likes creating technology articles\"},\n {\"inputs\":\"### Instruction: What do you know about Pavan Belagatti? \\n\\n### Response: Pavan Belagatti is a popular article creator who specializes in the field of Gen AI, DevOps, Data Science and he is very active on LinkedIn\"},\n {\"inputs\":\"### Instruction: Can you tell me about Pavan Belagatti? \\n\\n### Response: Pavan Belagatti is a developer evangelist,content creator,and he likes creating Gen AI, DevOps and Data Science articles\"}\n ]\n\n ## Lets define parameters for finetuning\n num_epochs=3\n count=0\n while count<num_epochs:\n print(f\"Fine tuning the model with iteration {count + 1}\")\n new_model_adapter.fine_tune(samples=samples)\n count=count+1\n\n #after fine tuning\n completion = new_model_adapter.complete(query=sample_query, max_generated_token_count=100).generated_output\n print(f\"Generated(after fine tuning): {completion}\")\n new_model_adapter.delete()\n gradient.close()\n\nif __name__ == \"__main__\":\n main()"},{"cell_type":"markdown","id":"52d674c8-3ca0-490e-8f83-9c9e7904471a","metadata":{"language":"python"},"source":"<div id=\"singlestore-footer\" style=\"background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px\"></div>\n<div><img src=\"https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png\" style=\"padding: 0px; margin: 0px; height: 24px\"/></div>"}],"metadata":{"jupyterlab":{"notebooks":{"version_major":6,"version_minor":4}},"kernelspec":{"display_name":"Python 3 (ipykernel)","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.11.6"},"singlestore_cell_default_language":"python","singlestore_connection":{}},"nbformat":4,"nbformat_minor":5}