Advanced Features
JSON Schema offers several advanced features that allow for more sophisticated data extraction. In this section, we'll cover optional values, default values, formats, and more.
Optional Properties
By default, all properties in a JSON Schema are optional. The AI will only include them in the output if it finds relevant information in the input text.
You can explicitly specify which properties are required using the required array at the object level.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The product name" }, "price": { "type": "number", "description": "The product price" }, "brand": { "type": "string", "description": "The brand name" }, "color": { "type": "string", "description": "The product color" } }, "required": ["name", "price"] }
Example Input:
The red Premium Blender costs $199.99.
Example Output:
{ "name": "Premium Blender", "price": 199.99, "color": "red" }
Pro Tip:
Only mark properties as required when they are truly necessary. This gives the AI more flexibility when extracting data from varied text inputs.
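Because optional properties can be absent, it may be worth confirming that the required ones did come back before using an extraction result. The following is a minimal Python sketch of such a check. The function name missing_required is hypothetical and not part of JSON Anything; the schema is the one above with descriptions removed.

```python
import json

# Schema from the example above, trimmed to the parts this check needs.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "price": {"type": "number"},
    "brand": {"type": "string"},
    "color": {"type": "string"}
  },
  "required": ["name", "price"]
}
""")


def missing_required(schema: dict, output: dict) -> list[str]:
    """Return the required property names that are absent from output."""
    return [key for key in schema.get("required", []) if key not in output]


output = {"name": "Premium Blender", "price": 199.99, "color": "red"}
print(missing_required(schema, output))            # []
print(missing_required(schema, {"color": "red"}))  # ['name', 'price']
```

The optional brand property is missing from the first output, and the check still passes because brand is not listed in required.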
Default Values
You can specify default values for properties. A default is used when the AI cannot find the corresponding information in the input text.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The product name" }, "price": { "type": "number", "description": "The product price" }, "currency": { "type": "string", "description": "The currency code", "default": "USD" }, "inStock": { "type": "boolean", "description": "Whether the product is in stock", "default": true } } }
Example Input:
The Premium Blender costs $199.99.
Example Output:
{ "name": "Premium Blender", "price": 199.99, "currency": "USD", "inStock": true }
Note:
Default values are only used when the AI cannot find or infer the value from the input text. If there's any information in the text that could be used for the property, the AI will use that instead of the default.
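In standard JSON Schema, default is only an annotation, and validators do not fill it in on their own. If you ever need the same fallback behavior on your side (for example, when post-processing results), it could be sketched in Python as below. The helper apply_defaults is a hypothetical name used for illustration, and the sketch handles top-level properties only.

```python
def apply_defaults(schema: dict, extracted: dict) -> dict:
    """Return a copy of extracted with schema defaults filled in for absent keys."""
    result = dict(extracted)
    for name, spec in schema.get("properties", {}).items():
        if name not in result and "default" in spec:
            result[name] = spec["default"]
    return result


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "currency": {"type": "string", "default": "USD"},
        "inStock": {"type": "boolean", "default": True},
    },
}

# Neither currency nor stock status was extracted, so both defaults apply.
filled = apply_defaults(schema, {"name": "Premium Blender", "price": 199.99})
print(filled)

# A value that was extracted from the text wins over the default.
explicit = apply_defaults(schema, {"name": "Premium Blender", "currency": "EUR"})
print(explicit["currency"])  # EUR
```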
Formats
The format keyword provides additional information about the expected format of string values. This helps the AI extract data in the correct format.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The person's name" }, "email": { "type": "string", "description": "The person's email address", "format": "email" }, "website": { "type": "string", "description": "The person's website", "format": "uri" }, "birthDate": { "type": "string", "description": "The person's date of birth", "format": "date" } } }
Example Input:
John Smith was born on November 15, 1985. You can reach him at [email protected] or visit his site at https://johnsmith.com.
Example Output:
{ "name": "John Smith", "email": "[email protected]", "website": "https://johnsmith.com", "birthDate": "1985-11-15" }
Common string formats supported by JSON Anything include:
date: ISO 8601 date format (YYYY-MM-DD)
date-time: ISO 8601 date-time format
email: Email address
hostname: Internet host name
ipv4: IPv4 address
ipv6: IPv6 address
uri: URI/URL
phone: Phone number (custom format)
currency: Currency value (custom format)
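A format describes the shape a string should have, so a quick local check can catch values that came back in the wrong shape. The Python sketch below covers three of the formats listed above using only the standard library. All function names are hypothetical, the checks are deliberately loose, and formats without a checker (including the custom phone and currency formats) are simply left unchecked.

```python
import ipaddress
import re
from datetime import date
from urllib.parse import urlparse

_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_date(value: str) -> bool:
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not _DATE_RE.fullmatch(value):
        return False
    year, month, day = (int(part) for part in value.split("-"))
    try:
        date(year, month, day)
    except ValueError:  # e.g. month 13 or February 30
        return False
    return True


def is_ipv4(value: str) -> bool:
    """True for a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True


def is_web_uri(value: str) -> bool:
    """Loose check: an absolute URI with both a scheme and a host part."""
    try:
        parts = urlparse(value)
    except ValueError:
        return False
    return bool(parts.scheme and parts.netloc)


FORMAT_CHECKS = {"date": is_date, "ipv4": is_ipv4, "uri": is_web_uri}


def check_format(fmt: str, value: str) -> bool:
    """Formats without a checker here are treated as unchecked and pass."""
    checker = FORMAT_CHECKS.get(fmt)
    return True if checker is None else checker(value)


print(check_format("date", "1985-11-15"))         # True
print(check_format("date", "November 15, 1985"))  # False
print(check_format("uri", "https://johnsmith.com"))  # True
```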
Minimum and Maximum Constraints
You can specify constraints on numeric values using minimum, maximum, exclusiveMinimum, and exclusiveMaximum.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The product name" }, "price": { "type": "number", "description": "The product price", "minimum": 0 }, "rating": { "type": "number", "description": "The product rating", "minimum": 1, "maximum": 5 }, "percentDiscount": { "type": "number", "description": "The discount percentage", "minimum": 0, "maximum": 100 } } }
Example Input:
The Premium Blender is priced at $199.99 with a 15% discount. It has a customer rating of 4.7 stars.
Example Output:
{ "name": "Premium Blender", "price": 199.99, "rating": 4.7, "percentDiscount": 15 }
Pro Tip:
Using constraints helps the AI understand the expected range of values, which can improve extraction accuracy, especially when dealing with ambiguous text.
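The difference between the inclusive and exclusive keywords is easiest to see in code. Below is a minimal Python sketch of a range check. The name in_range is hypothetical, and the sketch assumes the exclusive keywords hold numbers, as in JSON Schema draft-06 and later (draft-04 used booleans instead).

```python
def in_range(value: float, spec: dict) -> bool:
    """Check value against the four numeric range keywords.

    Assumes exclusiveMinimum and exclusiveMaximum are numbers (draft-06+).
    """
    if "minimum" in spec and value < spec["minimum"]:
        return False  # minimum is inclusive: value == minimum is allowed
    if "maximum" in spec and value > spec["maximum"]:
        return False  # maximum is inclusive as well
    if "exclusiveMinimum" in spec and value <= spec["exclusiveMinimum"]:
        return False  # the bound itself is not allowed
    if "exclusiveMaximum" in spec and value >= spec["exclusiveMaximum"]:
        return False
    return True


rating_spec = {"type": "number", "minimum": 1, "maximum": 5}
print(in_range(4.7, rating_spec))              # True
print(in_range(5.1, rating_spec))              # False
print(in_range(0, {"minimum": 0}))             # True
print(in_range(0, {"exclusiveMinimum": 0}))    # False
```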
String Length Constraints
You can set minimum and maximum length constraints for string values.
{ "type": "object", "properties": { "username": { "type": "string", "description": "The user's username", "minLength": 3, "maxLength": 20 }, "countryCode": { "type": "string", "description": "The two-letter country code", "minLength": 2, "maxLength": 2 }, "bio": { "type": "string", "description": "User's short biography", "maxLength": 200 } } }
Example Input:
John_Doe123 is from the United States (US) and describes himself as a software developer passionate about AI and machine learning.
Example Output:
{ "username": "John_Doe123", "countryCode": "US", "bio": "A software developer passionate about AI and machine learning." }
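Length constraints can also be verified after extraction. The sketch below is a hypothetical Python helper (length_ok is not a JSON Anything function). JSON Schema counts string length in characters rather than bytes, which matches what Python's len returns for a str.

```python
def length_ok(value: str, spec: dict) -> bool:
    """True when len(value) lies within minLength and maxLength, if present."""
    n = len(value)
    if n < spec.get("minLength", 0):
        return False
    if "maxLength" in spec and n > spec["maxLength"]:
        return False
    return True


username_spec = {"type": "string", "minLength": 3, "maxLength": 20}
country_spec = {"type": "string", "minLength": 2, "maxLength": 2}

print(length_ok("John_Doe123", username_spec))  # True
print(length_ok("US", country_spec))            # True
print(length_ok("USA", country_spec))           # False
```

Setting minLength and maxLength to the same value, as countryCode does, pins the string to an exact length.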
Pattern Matching
The pattern keyword allows you to specify a regular expression that a string value must match.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The person's name" }, "zipCode": { "type": "string", "description": "The US ZIP code", "pattern": "^\\d{5}(-\\d{4})?$" }, "phoneNumber": { "type": "string", "description": "The US phone number", "pattern": "^\\(\\d{3}\\) \\d{3}-\\d{4}$" } } }
Example Input:
Contact Sarah Johnson at (555) 123-4567. She lives in Seattle, WA 98101.
Example Output:
{ "name": "Sarah Johnson", "zipCode": "98101", "phoneNumber": "(555) 123-4567" }
Note:
The AI will attempt to format extracted values according to the specified pattern, but this may not always be perfect. Use clear descriptions along with patterns for best results.
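Since pattern matching may not always be perfect, a local re-check of the extracted values can be useful. The Python sketch below tests the two regular expressions from this section. The helper name matches is hypothetical, and Python's re module is close to, but not identical to, the ECMA 262 regex dialect that JSON Schema recommends. Note that inside a JSON document each backslash in a pattern has to be written twice; the last line shows what json.dumps produces.

```python
import json
import re

# Raw strings hold the regex exactly as the regex engine should see it.
ZIP_PATTERN = r"^\d{5}(-\d{4})?$"
PHONE_PATTERN = r"^\(\d{3}\) \d{3}-\d{4}$"


def matches(pattern: str, value: str) -> bool:
    """JSON Schema patterns are not implicitly anchored, so search is used.

    The ^ and $ in the patterns above are what force a whole-string match.
    """
    return re.search(pattern, value) is not None


print(matches(ZIP_PATTERN, "98101"))             # True
print(matches(ZIP_PATTERN, "98101-1234"))        # True
print(matches(PHONE_PATTERN, "(555) 123-4567"))  # True
print(matches(PHONE_PATTERN, "555-123-4567"))    # False

# Serializing to JSON doubles every backslash.
print(json.dumps({"pattern": ZIP_PATTERN}))
```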
Combining Multiple Validation Keywords
You can combine multiple validation keywords to define precise constraints on your data.
{ "type": "object", "properties": { "name": { "type": "string", "description": "The product name", "minLength": 2, "maxLength": 50 }, "price": { "type": "number", "description": "The product price", "exclusiveMinimum": 0 }, "category": { "type": "string", "description": "The product category", "enum": ["electronics", "clothing", "home", "books", "other"] }, "tags": { "type": "array", "description": "Product tags", "items": { "type": "string" }, "minItems": 1, "maxItems": 5, "uniqueItems": true }, "releaseDate": { "type": "string", "description": "The product release date", "format": "date" } }, "required": ["name", "price", "category"] }
Example Input:
The Ultra HD Smart TV is a new electronics product that will be released on January 15, 2024. It is priced at $899.99 and tagged as 4K, smart, LED, and HDR.
Example Output:
{ "name": "Ultra HD Smart TV", "price": 899.99, "category": "electronics", "tags": [ "4K", "smart", "LED", "HDR" ], "releaseDate": "2024-01-15" }
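The enum and array keywords used in this schema can be checked in the same spirit. The Python sketch below collects human-readable problems instead of stopping at the first one. Both function names are hypothetical, and the uniqueness test assumes the array items are hashable values such as strings.

```python
def check_enum(value, spec: dict) -> list[str]:
    """Return a one-item error list when value is outside spec's enum."""
    if "enum" in spec and value not in spec["enum"]:
        return [f"{value!r} is not an allowed value"]
    return []


def check_array(items: list, spec: dict) -> list[str]:
    """Check minItems, maxItems and uniqueItems (items must be hashable)."""
    errors = []
    if len(items) < spec.get("minItems", 0):
        errors.append("too few items")
    if "maxItems" in spec and len(items) > spec["maxItems"]:
        errors.append("too many items")
    if spec.get("uniqueItems") and len(set(items)) != len(items):
        errors.append("duplicate items")
    return errors


category_spec = {"enum": ["electronics", "clothing", "home", "books", "other"]}
tags_spec = {"type": "array", "minItems": 1, "maxItems": 5, "uniqueItems": True}

print(check_enum("electronics", category_spec))            # []
print(check_array(["4K", "smart", "LED", "HDR"], tags_spec))  # []
print(check_array(["4K", "4K"], tags_spec))                # ['duplicate items']
```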
Custom Vocabularies
In some cases, you may want to provide custom vocabularies to help the AI recognize specific terms or concepts in your domain.
{ "type": "object", "properties": { "diagnosis": { "type": "string", "description": "The medical diagnosis", "vocabulary": [ "hypertension", "type 2 diabetes", "asthma", "migraine", "chronic kidney disease", "rheumatoid arthritis" ] }, "severity": { "type": "string", "description": "The severity level", "enum": ["mild", "moderate", "severe"] } } }
Example Input:
The patient was diagnosed with moderate asthma.
Example Output:
{ "diagnosis": "asthma", "severity": "moderate" }
Note:
The vocabulary keyword is a custom extension in JSON Anything and not part of the standard JSON Schema. It helps the AI identify specific terms but is not as strict as enum: terms outside the vocabulary list are still accepted.
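The contrast between a strict enum and a soft vocabulary can be illustrated with a small post-check. This Python sketch does not describe how JSON Anything applies vocabulary internally, which this page does not document; classify_term is a hypothetical helper that only mirrors the rule stated in the note: enum rejects unknown terms, while vocabulary lets them through.

```python
def classify_term(value: str, spec: dict) -> str:
    """Return 'rejected', 'unlisted', or 'known' for an extracted string."""
    if "enum" in spec and value not in spec["enum"]:
        return "rejected"  # enum is strict: unknown values are invalid
    if "vocabulary" in spec and value not in spec["vocabulary"]:
        return "unlisted"  # vocabulary is a hint: still accepted
    return "known"


diagnosis_spec = {
    "type": "string",
    "vocabulary": ["hypertension", "type 2 diabetes", "asthma", "migraine"],
}
severity_spec = {"type": "string", "enum": ["mild", "moderate", "severe"]}

print(classify_term("asthma", diagnosis_spec))      # known
print(classify_term("bronchitis", diagnosis_spec))  # unlisted
print(classify_term("critical", severity_spec))     # rejected
```

Flagging unlisted terms rather than discarding them keeps the output usable while still surfacing values that may deserve a manual review.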