Quick Start Guide

This guide will get you up and running with Prompture in just a few minutes. After completing the Installation Guide, you’re ready to start extracting structured data from text using LLMs.

Basic Setup

First, make sure you have your API keys configured in a .env file:

# .env file
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here

Import Prompture in your Python script:

from prompture import extract_and_jsonify, extract_with_model, field_from_registry
from pydantic import BaseModel
from typing import Optional

Your First Extraction

Let’s start with a simple example extracting person information from text:

from prompture import extract_and_jsonify

# Define what data you want to extract
fields = {
    "name": "string",
    "age": "integer",
    "occupation": "string"
}

# Text containing the information
text = "Sarah Johnson is a 32-year-old software engineer at TechCorp."

# Extract structured data
result = extract_and_jsonify(
    prompt=text,
    fields=fields,
    model_name="openai/gpt-3.5-turbo"
)

print(result)
# Output: {"name": "Sarah Johnson", "age": 32, "occupation": "software engineer"}

Using Field Definitions

Prompture provides a powerful field definitions system with built-in validation and descriptions. Here’s how to use pre-defined fields:

from prompture import extract_and_jsonify

# Use built-in field definitions
fields = {
    "name": "name",           # Built-in name field with validation
    "age": "age",             # Built-in age field (integer, 0-150)
    "email": "email",         # Built-in email field with format validation
    "phone": "phone"          # Built-in phone field
}

text = """
Contact: John Doe, 28 years old
Email: john.doe@company.com
Phone: +1-555-123-4567
"""

result = extract_and_jsonify(
    prompt=text,
    fields=fields,
    model_name="openai/gpt-4"
)

print(result)
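# Expected output (exact values depend on the model's response):
# {"name": "John Doe", "age": 28, "email": "john.doe@company.com", "phone": "+1-555-123-4567"}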

Custom Field Definitions

You can register your own field definitions for reusable, validated fields:

from prompture import register_field, field_from_registry, extract_with_model
from pydantic import BaseModel
from typing import List, Optional

# Register custom fields
register_field("skills", {
    "type": "list",
    "description": "List of professional skills and competencies",
    "instructions": "Extract skills as a list of strings",
    "default": [],
    "nullable": True
})

register_field("experience_years", {
    "type": "int",
    "description": "Years of professional experience",
    "instructions": "Extract total years of work experience",
    "default": 0,
    "nullable": True
})

# Use custom fields in a model
class Professional(BaseModel):
    name: str = field_from_registry("name")
    skills: Optional[List[str]] = field_from_registry("skills")
    experience_years: Optional[int] = field_from_registry("experience_years")
    occupation: Optional[str] = field_from_registry("occupation")

# Extract professional profile
text = """
Michael Chen has 8 years of experience as a data scientist.
His skills include Python, machine learning, SQL, and data visualization.
"""

result = extract_with_model(
    model_class=Professional,
    prompt=text,
    model_name="openai/gpt-4"
)

print(f"Professional: {result.name}")
print(f"Skills: {', '.join(result.skills)}")
print(f"Experience: {result.experience_years} years")

Different LLM Providers

Prompture supports multiple LLM providers. Simply change the model_name parameter. Note that provider model identifiers evolve over time, so check each provider's documentation for the names currently available to your account:

from prompture import extract_and_jsonify

fields = {"name": "name", "age": "age"}
text = "Emma Watson, 33 years old"

# OpenAI GPT models
result1 = extract_and_jsonify(text, fields, model_name="openai/gpt-4")
result2 = extract_and_jsonify(text, fields, model_name="openai/gpt-3.5-turbo")

# Anthropic Claude models
result3 = extract_and_jsonify(text, fields, model_name="anthropic/claude-3-haiku-20240307")
result4 = extract_and_jsonify(text, fields, model_name="anthropic/claude-3-sonnet-20240229")

# Google Gemini models
result5 = extract_and_jsonify(text, fields, model_name="google/gemini-pro")

# Groq models (fast inference)
result6 = extract_and_jsonify(text, fields, model_name="groq/llama2-70b-4096")

# Local models via Ollama
result7 = extract_and_jsonify(text, fields, model_name="ollama/llama2")

Template Variables

Prompture supports template variables in field definitions for dynamic defaults:

from prompture import register_field, field_from_registry, extract_with_model
from pydantic import BaseModel

# Register field with template variables
register_field("processed_at", {
    "type": "str",
    "description": "When this data was processed",
    "instructions": "Use {{current_datetime}} for processing timestamp",
    "default": "{{current_datetime}}",
    "nullable": False
})

register_field("document_year", {
    "type": "int",
    "description": "Year of the document",
    "instructions": "Extract year, use {{current_year}} if not specified",
    "default": "{{current_year}}",
    "nullable": False
})

class Document(BaseModel):
    title: str = field_from_registry("title")
    document_year: int = field_from_registry("document_year")
    processed_at: str = field_from_registry("processed_at")

text = "Annual Report: Company Performance Review"

result = extract_with_model(
    model_class=Document,
    prompt=text,
    model_name="openai/gpt-4"
)

print(f"Document: {result.title}")
print(f"Year: {result.document_year}")  # Will use current year if not found
print(f"Processed: {result.processed_at}")  # Current datetime

Error Handling

Prompture provides built-in error handling and validation:

from prompture import extract_and_jsonify, validate_against_schema
import json

try:
    result = extract_and_jsonify(
        prompt="Invalid text with no clear data",
        fields={"name": "name", "age": "age"},
        model_name="openai/gpt-4"
    )

    # Validate the result
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer", "minimum": 0, "maximum": 150}
        },
        "required": ["name", "age"]
    }

    is_valid = validate_against_schema(result, schema)
    if is_valid:
        print("✅ Valid result:", result)
    else:
        print("❌ Invalid result format")

except Exception as e:
    print(f"❌ Extraction failed: {e}")

Batch Processing

For processing multiple texts, you can use a loop or batch approach:

from prompture import extract_with_model, field_from_registry
from pydantic import BaseModel
from typing import Optional

class Contact(BaseModel):
    name: str = field_from_registry("name")
    email: Optional[str] = field_from_registry("email")
    phone: Optional[str] = field_from_registry("phone")

# Multiple text samples
texts = [
    "John Smith - john@company.com - (555) 123-4567",
    "Alice Johnson, email: alice.j@startup.io, phone: +1-555-987-6543",
    "Bob Wilson | bwilson@corp.com | 555.111.2222"
]

results = []
for text in texts:
    try:
        contact = extract_with_model(
            model_class=Contact,
            prompt=text,
            model_name="openai/gpt-3.5-turbo"
        )
        results.append(contact)
    except Exception as e:
        print(f"Failed to extract from '{text}': {e}")

for contact in results:
    print(f"Name: {contact.name}, Email: {contact.email}")

Configuration Tips

Environment Variables

Keep API keys in .env files and never commit them to version control.
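
Depending on how you run your scripts, the variables in .env may not be loaded automatically. A minimal sketch for loading them explicitly with the python-dotenv package (an assumption here; install it separately):

# pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into the process environment

# Confirm the key is visible before making API calls
assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"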

Model Selection
  • Use gpt-3.5-turbo for fast, cost-effective extraction

  • Use gpt-4 for complex or nuanced extraction tasks

  • Use claude-3-haiku for fast Anthropic processing

  • Use local models (Ollama) for privacy or offline use (a model-selection helper is sketched after this list)
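
One way to apply these guidelines is to centralize the model choice in a small helper, so switching tiers never touches your extraction code. The tier names and mapping below are illustrative, not part of Prompture:

from prompture import extract_and_jsonify

# Hypothetical cost/quality tiers; adjust the model names to what your account offers.
MODEL_TIERS = {
    "fast": "openai/gpt-3.5-turbo",
    "accurate": "openai/gpt-4",
    "local": "ollama/llama2",
}

def extract_by_tier(text, fields, tier="fast"):
    return extract_and_jsonify(
        prompt=text,
        fields=fields,
        model_name=MODEL_TIERS[tier],
    )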

Field Definitions
  • Use built-in fields when possible for consistency

  • Register custom fields for domain-specific data

  • Include clear descriptions and instructions in field definitions

Error Handling
  • Always wrap extraction calls in try/except blocks (see the retry sketch after this list)

  • Validate results when data quality is critical

  • Use nullable fields for optional data
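
Putting these tips together, a small retry wrapper can smooth over transient failures. This is a sketch rather than a built-in Prompture feature; it retries on any exception with a simple linear backoff:

import time

from prompture import extract_and_jsonify

def extract_with_retry(prompt, fields, model_name, attempts=3, backoff=2.0):
    for attempt in range(1, attempts + 1):
        try:
            return extract_and_jsonify(prompt=prompt, fields=fields, model_name=model_name)
        except Exception as e:
            if attempt == attempts:
                raise  # out of attempts; let the caller handle the failure
            print(f"Attempt {attempt} failed ({e}); retrying...")
            time.sleep(backoff * attempt)  # wait longer after each failure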

Next Steps

Now that you’ve learned the basics, explore:

  • Examples - More comprehensive examples and use cases

  • field_definitions - Advanced field definition techniques

  • drivers - Working with different LLM providers

  • API Reference - Complete API reference

For practical examples with different LLM providers and complex extraction scenarios, see the Examples section.