LLM-Powered Feature Extraction for Machine Learning Pipelines with Ollama Structured Outputs
This tutorial shows you how to extract machine learning features from text data with structured outputs using large language models in Python. The tech stack includes open-source AI tools such as Ollama and Meta's Llama3.2 large language model, as well as the pandas and gnews libraries.
As an example, the tutorial shows how you could filter for future-looking stock sentiment from news article headlines when creating financial sentiment scores.
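For context, here is a minimal sketch of how the headline data might be pulled with the gnews library and loaded into pandas. The ticker and GNews settings are illustrative placeholders, not necessarily the exact values used in the video:

```python
# Hedged sketch: fetch recent headlines with gnews and load them into pandas.
# The search term "NVDA" and the GNews settings are placeholder assumptions.
from gnews import GNews
import pandas as pd

google_news = GNews(language="en", country="US", period="7d", max_results=25)
articles = google_news.get_news("NVDA")  # list of dicts: 'title', 'url', 'publisher', ...
headlines = pd.DataFrame(articles)[["title", "published date"]]
print(headlines.head())
```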
Structured output is still an error-prone exercise, as new libraries attempt to wrangle AI model output into structured formats. But if you can master it, integrate it into your machine learning workflows, and even automate the creation of ML features and datasets, you will have a competitive advantage as a machine learning engineer, AI engineer, or data scientist.
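To make the idea concrete, here is a hedged sketch of the structured-output step: a Pydantic schema is passed to ollama.chat() through the format argument, and the model's JSON reply is validated back into that schema. The field names, prompt, and apply() step are illustrative assumptions, not the exact schema from the video:

```python
# Hedged sketch of structured feature extraction with the ollama Python client.
# Schema fields and prompt wording are illustrative, not the video's exact code.
from pydantic import BaseModel
import ollama

class HeadlineFeatures(BaseModel):
    sentiment: float          # e.g. -1 (bearish) to 1 (bullish)
    is_future_looking: bool   # does the headline speculate about future performance?

def extract_features(headline: str) -> HeadlineFeatures:
    response = ollama.chat(
        model="llama3.2",
        messages=[{
            "role": "user",
            "content": ("Score the sentiment of this stock headline and flag whether "
                        f"it is future-looking: {headline}"),
        }],
        format=HeadlineFeatures.model_json_schema(),  # constrain output to the schema
    )
    return HeadlineFeatures.model_validate_json(response.message.content)

# 'headlines' is the DataFrame from the gnews sketch above;
# the extracted fields become ML feature columns.
features = headlines["title"].apply(extract_features)
headlines["sentiment"] = features.apply(lambda f: f.sentiment)
headlines["is_future_looking"] = features.apply(lambda f: f.is_future_looking)
```

Constraining the response to a schema, rather than parsing free-text JSON, is what makes the extracted columns consistent enough to feed into a downstream model.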
Like (👍), comment, and subscribe to the Deep Charts channel for more tutorials on the latest AI, machine learning, and data science tools.
*Subscribe to the Deep Charts Newsletter*
https://deepcharts.substack.com/
****Important Note: This video is not financial or investing advice. It is an educational tutorial on how to use AI/LLM models in Python. Also, don't blindly trust LLM results without critical thinking or subject matter expertise 🧠. LLMs are still an experimental technology with high error rates.****
**Full Code**
Github: https://github.com/deepcharts/projects/blob/main/structured_outputs.ipynb
**Resources**
Ollama: https://ollama.com/
Llama 3.2 Model: https://ollama.com/library/llama3.2
**Chapters**
0:00 Why should you use Structured Outputs?
0:35 Tech Stack for generating Structured Outputs with Open Source tools and models
0:56 Use Case: Filtering for Future-Looking Financial Sentiment in News Article Headlines
1:49 Ollama Installation, Llama3.2 Installation, and Python Environment Setup
2:28 Pulling news headline data with the gnews library
2:55 Initializing the LLM Model to extract Structured Outputs
4:34 Assessing the results