Contributing to Prompture

We welcome contributions to Prompture! This guide will help you get started with contributing to the project.

Getting Started

Prerequisites

  • Python 3.8 or higher

  • Git for version control

  • A GitHub account

Setting Up Development Environment

  1. Fork the Repository

    Fork the Prompture repository on GitHub to your account.

  2. Clone Your Fork

    git clone https://github.com/your-username/prompture.git
    cd prompture
    
  3. Create a Virtual Environment

    python -m venv prompture-dev
    source prompture-dev/bin/activate  # On Windows: prompture-dev\Scripts\activate
    
  4. Install in Development Mode

    pip install -e ".[dev]"
    
  5. Install Pre-commit Hooks (if available)

    pre-commit install
    

Types of Contributions

We accept several types of contributions:

Bug Reports

If you find a bug, please create an issue with:

  • Clear description of the problem

  • Steps to reproduce the issue

  • Expected vs actual behavior

  • System information (Python version, OS, etc.)

  • Code example that demonstrates the issue

Feature Requests

For new features, please:

  • Check existing issues to avoid duplicates

  • Describe the use case and why it’s needed

  • Provide examples of how it would be used

  • Consider backward compatibility

Documentation Improvements

Documentation contributions are highly valued:

  • Fix typos and grammar errors

  • Improve clarity of existing documentation

  • Add examples and use cases

  • Update outdated information

Code Contributions

Code contributions should:

  • Follow coding standards (see below)

  • Include tests for new functionality

  • Update documentation as needed

  • Maintain backward compatibility when possible

Development Workflow

  1. Create a Branch

    git checkout -b feature/your-feature-name
    # or
    git checkout -b bugfix/issue-description
    
  2. Make Your Changes

    Write your code following the project standards.

  3. Run Tests

    pytest tests/
    
  4. Update Documentation

    Update relevant documentation files and docstrings.

  5. Commit Your Changes

    git add .
    git commit -m "feat: add new extraction method for PDFs"
    
  6. Push to Your Fork

    git push origin feature/your-feature-name
    
  7. Create Pull Request

    Open a pull request from your fork to the main repository.

Coding Standards

Code Style

  • Follow PEP 8 Python style guidelines

  • Use type hints for function parameters and return values

  • Write clear, descriptive variable names

  • Keep line length under 100 characters

# Good example
def extract_person_data(text: str, model_name: str = "openai/gpt-4") -> Dict[str, Any]:
    """Extract person information from text using specified model."""
    pass

# Bad example
def extract(txt, m="gpt4"):
    pass

Documentation

  • Write comprehensive docstrings for all public functions and classes

  • Use Google-style docstrings

  • Include examples in docstrings when helpful

def register_field(name: str, definition: Dict[str, Any]) -> None:
    """Register a custom field definition in the global registry.

    Args:
        name: Unique identifier for the field
        definition: Field specification dictionary containing type,
            description, instructions, and validation rules

    Raises:
        ValueError: If field definition is invalid
        KeyError: If required definition keys are missing

    Example:
        >>> register_field("skills", {
        ...     "type": "list",
        ...     "description": "List of professional skills",
        ...     "instructions": "Extract as comma-separated values",
        ...     "default": []
        ... })
    """

Testing Guidelines

Test Structure

  • Place tests in the tests/ directory

  • Name test files with test_ prefix

  • Group related tests in test classes

  • Use descriptive test method names

import pytest
from prompture import extract_and_jsonify, register_field

class TestFieldRegistration:
    """Test field registration functionality."""

    def test_register_custom_field_success(self):
        """Test successful registration of custom field."""
        register_field("test_field", {
            "type": "str",
            "description": "Test field for unit testing"
        })
        # Add assertions here

    def test_register_field_invalid_type_raises_error(self):
        """Test that invalid field type raises ValueError."""
        with pytest.raises(ValueError):
            register_field("bad_field", {"type": "invalid_type"})

Running Tests

# Run all tests
pytest

# Run specific test file
pytest tests/test_field_definitions.py

# Run with coverage
pytest --cov=prompture tests/

# Run with verbose output
pytest -v

Test Coverage

  • Aim for high test coverage (>80%)

  • Test both success and failure cases

  • Include edge cases and boundary conditions

  • Mock external API calls in tests

Adding New LLM Drivers

To add support for a new LLM provider:

  1. Create Driver Class

    Create a new file in prompture/drivers/ following the pattern:

    from prompture.driver import Driver
    import requests
    from typing import Dict, Any
    
    class NewProviderDriver(Driver):
        """Driver for NewProvider LLM service."""
    
        def __init__(self, model_name: str, **kwargs):
            super().__init__(model_name, **kwargs)
            self.api_key = kwargs.get('api_key') or os.getenv('NEWPROVIDER_API_KEY')
            self.base_url = kwargs.get('base_url', 'https://api.newprovider.com/v1')
    
        def ask_for_json(self, prompt: str, **kwargs) -> Dict[str, Any]:
            """Send prompt to NewProvider API and return JSON response."""
            # Implement API call logic here
            pass
    
  2. Update Driver Registry

    Add your driver to the driver registry in prompture/drivers/__init__.py.

  3. Add Tests

    Create comprehensive tests for your driver in tests/test_drivers.py.

  4. Update Documentation

    Add documentation for the new driver including setup instructions and examples.

Adding New Field Types

To add built-in field definitions:

  1. Add Field Definition

    Add the field to prompture/field_definitions.py:

    FIELD_DEFINITIONS["new_field_type"] = {
        "type": "str",
        "description": "Description of the new field type",
        "instructions": "Instructions for LLM on how to extract this field",
        "default": "default_value",
        "nullable": True,
        "validation": {
            "pattern": r"^[A-Z][a-z]+$"  # Optional regex validation
        }
    }
    
  2. Add Tests

    Test the field definition in tests/test_field_definitions.py.

  3. Update Documentation

    Document the new field type in the API documentation.

Documentation Contributions

Building Documentation

To build the documentation locally:

cd docs/
pip install -r requirements.txt
make html  # On Windows: make.bat html

The built documentation will be in docs/_build/html/.

Documentation Standards

  • Use reStructuredText format for documentation files

  • Include practical examples for all features

  • Keep language clear and concise

  • Use proper cross-references with :doc: and :ref: directives

See the :doc:`quickstart` guide for basic usage examples.

For more details, refer to :func:`prompture.extract_and_jsonify`.

Pull Request Guidelines

Creating Good Pull Requests

  • Use descriptive titles that explain what the PR does

  • Write detailed descriptions explaining the changes and why they were made

  • Reference related issues using Fixes #123 or Closes #456

  • Keep PRs focused - one feature or fix per PR

  • Update relevant documentation

Pull Request Template

When creating a PR, include:

## Description
Brief description of changes made.

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Documentation update
- [ ] Performance improvement

## Testing
- [ ] Added tests for new functionality
- [ ] All existing tests pass
- [ ] Manual testing completed

## Documentation
- [ ] Updated relevant documentation
- [ ] Added docstrings for new functions
- [ ] Updated API reference if needed

Code Review Process

All contributions go through code review:

  1. Automated checks run on your PR (tests, linting, etc.)

  2. Maintainer review for code quality and design

  3. Feedback incorporation - address review comments

  4. Final approval and merge

Common Review Feedback

  • Add type hints for better code clarity

  • Improve test coverage for edge cases

  • Update documentation to reflect changes

  • Follow naming conventions consistently

  • Handle errors gracefully with proper exception handling

Release Process

Prompture follows semantic versioning (SemVer):

  • Major version (1.0.0) - Breaking changes

  • Minor version (0.1.0) - New features, backward compatible

  • Patch version (0.0.1) - Bug fixes, backward compatible

Releases are managed by maintainers and include:

  • Updated version numbers

  • Changelog entries

  • Git tags

  • PyPI package uploads

Getting Help

If you need help with contributing:

  • Check existing documentation and examples first

  • Search existing issues for similar problems

  • Create a discussion for general questions

  • Join community channels (if available)

  • Tag maintainers in issues for urgent matters

Recognition

Contributors are recognized in:

  • CONTRIBUTORS.md file listing all contributors

  • Release notes highlighting major contributions

  • Documentation credits for documentation contributors

Thank you for contributing to Prompture! Your contributions help make structured LLM data extraction better for everyone.