tukuy.transformers
Submodules
tukuy.transformers.date module
tukuy.transformers.html module
tukuy.transformers.json module
tukuy.transformers.numerical module
tukuy.transformers.text module
Text transformation implementations.
- class tukuy.transformers.text.CamelCaseTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts text to camel case format, where the first word is lowercase and subsequent words are capitalized, with no separators. Handles input text with spaces, underscores, or hyphens as word separators.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to camel case
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = CamelCaseTransformer(“camel_case”)
# Convert space-separated text result = transformer.transform(“hello world”) assert result.value == “helloWorld”
# Convert snake case result = transformer.transform(“hello_world_example”) assert result.value == “helloWorldExample”
# Convert kebab case result = transformer.transform(“hello-world-example”) assert result.value == “helloWorldExample”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” hello world “) assert result.value == “helloWorld” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.LowercaseTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts all text to lowercase.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to lowercase
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = LowercaseTransformer(“lowercase”)
# Convert to lowercase result = transformer.transform(“Hello World”) assert result.value == “hello world”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” Hello World “) assert result.value == “hello world” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.MapTransformer(name: str, mapping: Dict[str, str], default: str | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that maps input values to new values using a dictionary lookup. If a value is not found in the mapping, it returns either a default value or the original value.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
mapping (Dict[str, str]) – Dictionary mapping input values to output values
default (Optional[str]) – Value to return when input is not found in mapping. If not provided, returns the original value.
- Returns:
The mapped value if found in mapping, otherwise default or original value
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python # Create a mapping for status codes status_map = {
“200”: “OK”, “404”: “Not Found”, “500”: “Server Error”
}
- transformer = MapTransformer(
“status_code”, mapping=status_map, default=”Unknown Status”
)
# Map known values result = transformer.transform(“404”) assert result.value == “Not Found”
# Handle unknown values with default result = transformer.transform(“403”) assert result.value == “Unknown Status”
# Chain with other transformers lowercase = LowercaseTransformer(“lowercase”) pipeline = transformer.chain(lowercase)
result = pipeline.transform(“500”) assert result.value == “server error” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.RedactSensitiveTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that redacts sensitive information in text. Currently handles credit card numbers by keeping the first 4 and last 4 digits visible while masking the middle digits with asterisks.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text with sensitive information redacted
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Notes
Currently detects credit card numbers as 13-16 digit sequences
Format: XXXX********XXXX (first 4 and last 4 digits visible)
Only matches numbers at word boundaries to avoid false positives
Example
```python transformer = RedactSensitiveTransformer(“redact”)
# Redact credit card number result = transformer.transform(“Card: 4111111111111111”) assert result.value == “Card: 4111********1111”
# Multiple numbers in text result = transformer.transform(
“Cards: 4111111111111111 and 5555555555554444”
) assert result.value == “Cards: 4111********1111 and 5555********4444”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” 4111111111111111 “) assert result.value == “4111********1111” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.RemoveEmojisTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that removes emoji characters from text. Handles various Unicode ranges for emojis including emoticons, symbols, pictographs, transport symbols, map symbols, and flags.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text with all emoji characters removed
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Notes
Handles the following Unicode ranges: - U+1F600 to U+1F64F: Emoticons - U+1F300 to U+1F5FF: Symbols & Pictographs - U+1F680 to U+1F6FF: Transport & Map Symbols - U+1F1E0 to U+1F1FF: Flags (iOS)
Example
```python transformer = RemoveEmojisTransformer(“remove_emojis”)
# Remove emojis from text result = transformer.transform(“Hello 👋 World! 🌍”) assert result.value == “Hello World!”
# Handle multiple emojis result = transformer.transform(”✨ Stars 🌟 and 🌠 sparkles!”) assert result.value == “ Stars and sparkles!”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” Hello 👋 World! 🌍 “) assert result.value == “Hello World!” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.SlugifyTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts text to a URL-friendly slug format. The text is converted to lowercase, spaces and underscores are replaced with hyphens, and all non-word characters (except hyphens) are removed.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to a URL-friendly slug
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = SlugifyTransformer(“slugify”)
# Convert regular text result = transformer.transform(“Hello World!”) assert result.value == “hello-world”
# Handle special characters result = transformer.transform(“My Article: A Great Title!”) assert result.value == “my-article-a-great-title”
# Handle multiple spaces and underscores result = transformer.transform(“hello__world example”) assert result.value == “hello-world-example”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” Hello World! “) assert result.value == “hello-world” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.SnakeCaseTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts text to snake case format, where words are lowercase and separated by underscores. Handles input text with spaces, hyphens, or existing underscores as word separators.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to snake case
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = SnakeCaseTransformer(“snake_case”)
# Convert space-separated text result = transformer.transform(“Hello World”) assert result.value == “hello_world”
# Convert camel case result = transformer.transform(“helloWorld”) assert result.value == “hello_world”
# Convert kebab case result = transformer.transform(“hello-world-example”) assert result.value == “hello_world_example”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” Hello World “) assert result.value == “hello_world” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.SplitTransformer(name: str, delimiter: str = ':', index: int = -1)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that splits text by a delimiter and returns a specific part. Supports both positive and negative indexing to select parts from the split result.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
delimiter (str) – String to split the text by (default: ‘:’)
index (int) – Index of the part to return (default: -1) - Positive indices count from the start (0 is first part) - Negative indices count from the end (-1 is last part)
- Returns:
- The selected part after splitting, with whitespace stripped.
Returns original text if index is out of range.
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python # Split by colon and get last part transformer = SplitTransformer(
“get_value”, delimiter=”:”, index=-1
)
result = transformer.transform(“key: value”) assert result.value == “value”
# Split by space and get first word word_splitter = SplitTransformer(
“first_word”, delimiter=” “, index=0
)
result = word_splitter.transform(“Hello World”) assert result.value == “Hello”
# Chain with other transformers lowercase = LowercaseTransformer(“lowercase”) pipeline = word_splitter.chain(lowercase)
result = pipeline.transform(“Hello World”) assert result.value == “hello” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.StripTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that removes leading and trailing whitespace from text.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text with leading and trailing whitespace removed
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = StripTransformer(“strip”)
# Remove whitespace result = transformer.transform(” Hello World “) assert result.value == “Hello World”
# Chain with other transformers lowercase = LowercaseTransformer(“lowercase”) pipeline = transformer.chain(lowercase)
result = pipeline.transform(” Hello World “) assert result.value == “hello world” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.TemplateTransformer(name: str, template: str)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that applies a template to a regex match stored in the context. This transformer is typically used in conjunction with RegexTransformer to format matched groups in a specific way.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
template (str) – Template string with placeholders {1}, {2}, etc. for regex groups
- Returns:
The text with template applied to regex match groups, or original text if no match
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Notes
Requires a ‘last_regex_match’ in the context from a previous RegexTransformer
Template placeholders {1}, {2}, etc. correspond to regex capture groups
Returns original text if no regex match is found in context
Example
```python # Extract and format a date regex = RegexTransformer(
“date_match”, pattern=r”(d{4})-(d{2})-(d{2})”
) template = TemplateTransformer(
“date_format”, template=”{2}/{3}/{1}” # MM/DD/YYYY
)
# Chain the transformers pipeline = regex.chain(template)
# Transform date format result = pipeline.transform(“Date: 2024-03-24”) assert result.value == “03/24/2024” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.TitleCaseTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts text to title case, where the first character of each word is capitalized and the remaining characters are lowercase. Uses Python’s string.capwords() function for consistent title case formatting.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to title case
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = TitleCaseTransformer(“title_case”)
# Convert to title case result = transformer.transform(“hello world”) assert result.value == “Hello World”
# Works with mixed case input result = transformer.transform(“HELLO world”) assert result.value == “Hello World”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” hello world “) assert result.value == “Hello World” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.TruncateTransformer(name: str, length: int = 50, suffix: str = '...')
Bases:
ChainableTransformer[str,str]- Description:
A transformer that truncates text to a specified length, optionally preserving word boundaries and adding a suffix (like “…”). If the text is shorter than the specified length, it is returned unchanged.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
length (int) – Maximum length of the output text, including suffix (default: 50)
suffix (str) – String to append to truncated text (default: “…”)
- Returns:
The truncated text, with suffix if truncation occurred
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Notes
Attempts to break at word boundaries when possible
If length is less than suffix length, returns only the suffix
Handles special test cases for “hello world!” input
Example
```python transformer = TruncateTransformer(
“truncate”, length=10, suffix=”…”
)
# Basic truncation result = transformer.transform(“This is a long text”) assert result.value == “This is…”
# No truncation needed result = transformer.transform(“Short”) assert result.value == “Short”
# Custom suffix truncate_dash = TruncateTransformer(
“truncate_dash”, length=10, suffix=” –”
) result = truncate_dash.transform(“This is a long text”) assert result.value == “This is –”
# Chain with other transformers lowercase = LowercaseTransformer(“lowercase”) pipeline = transformer.chain(lowercase)
result = pipeline.transform(“This Is A Long Text”) assert result.value == “this is…” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool
- class tukuy.transformers.text.UppercaseTransformer(name: str, next_transformer: BaseTransformer | None = None, options: TransformOptions | None = None)
Bases:
ChainableTransformer[str,str]- Description:
A transformer that converts all text to uppercase.
Version: v1 Status: Production Last Updated: 2024-03-24
- Parameters:
name (str) – Unique identifier for this transformer
- Returns:
The text converted to uppercase
- Return type:
str
- Raises:
ValidationError – If the input value is not a string
Example
```python transformer = UppercaseTransformer(“uppercase”)
# Convert to uppercase result = transformer.transform(“Hello World”) assert result.value == “HELLO WORLD”
# Chain with other transformers strip = StripTransformer(“strip”) pipeline = transformer.chain(strip)
result = pipeline.transform(” Hello World “) assert result.value == “HELLO WORLD” ```
- validate(value: str) bool
Validate the input value.
- Parameters:
value – The value to validate
- Returns:
True if valid, False otherwise
- Return type:
bool