by UnityAI
Image-to-Text Description API Endpoint
```python
import logging
from typing import List
from fastapi import FastAPI, File, UploadFile, HTTPException
from fastapi.responses import RedirectResponse
from abilities import apply_sqlite_migrations, llm
from models import Base, engine
import io

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = FastAPI()

ALLOWED_IMAGE_TYPES = ["image/jpeg", "image/png"]

@app.get("/", include_in_schema=False)
def root():
    return RedirectResponse(url="/docs")

@app.post("/analyze-image")
async def analyze_image(file: UploadFile = File(...)):
    try:
        # Validate file type
        if file.content_type not in ALLOWED_IMAGE_TYPES:
```
Frequently Asked Questions
What are some potential business applications for this Image-to-Text Description API?
The Image-to-Text Description API has business applications across many industries. Some potential use cases include:
- E-commerce: Automatically generating product descriptions from images
- Social media: Improving accessibility by providing alt text for images
- Content management: Organizing and categorizing large image databases
- Marketing: Analyzing visual content for brand consistency and messaging
- Real estate: Describing property images for listings
How can this API improve user experience in digital products?
The Image-to-Text Description API can enhance user experience by:
- Making content more accessible to visually impaired users
- Enabling voice-based interactions with visual content
- Improving search functionality by making image content searchable
- Providing quick summaries of visual information
- Enhancing content recommendations based on image analysis
What industries could benefit most from implementing this Image-to-Text Description API?
Several industries can benefit from this API, including:
- Media and publishing: Automating image captioning for news articles and blogs
- Healthcare: Describing medical images for quick reference or patient communication
- Education: Making visual learning materials more accessible
- Tourism: Generating descriptions for travel photos and landmarks
- Retail: Improving product discovery and recommendations based on visual attributes
How can I customize the LLM prompt in the Image-to-Text Description API to get more specific results?
You can customize the LLM prompt by modifying the `prompt` parameter in the `llm` function call. For example, if you want to focus on describing colors and shapes, you could change the code like this:
```python
result = llm(
    prompt="Please describe the main colors and shapes present in this image.",
    response_schema=response_schema,
    image_url=image_url,
    model="gpt-4o",
    temperature=0.7
)
```
You can adjust the prompt to focus on any specific aspects of the image you're interested in analyzing.
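If you need several prompt variants, one option is to build them from a small lookup table instead of editing the string by hand each time. The sketch below is self-contained and does not call `llm`. `PROMPT_TEMPLATES` and `build_prompt` are hypothetical names introduced for illustration, and the "general" and "text" prompts are assumptions, since the template's default prompt is not shown here.

```python
# Hypothetical prompt table; only the "colors" entry comes from the example above.
PROMPT_TEMPLATES = {
    "general": "Please describe this image in detail.",  # assumed wording
    "colors": "Please describe the main colors and shapes present in this image.",
    "text": "Please transcribe any text visible in this image.",  # assumed wording
}


def build_prompt(focus="general", max_sentences=None):
    """Return a prompt string for the given focus, optionally capping length."""
    if focus not in PROMPT_TEMPLATES:
        raise ValueError(
            f"Unknown focus {focus!r}; expected one of {sorted(PROMPT_TEMPLATES)}"
        )
    prompt = PROMPT_TEMPLATES[focus]
    if max_sentences is not None:
        prompt += f" Limit the answer to {max_sentences} sentences."
    return prompt
```

The returned string would then be passed as the `prompt` argument, for example `prompt=build_prompt("colors", max_sentences=2)`.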
How can I extend the Image-to-Text Description API to handle multiple images in a single request?
To handle multiple images, you can modify the `/analyze-image` endpoint to accept a list of files. Here's an example of how you could update the code:
```python
from fastapi import FastAPI, File, UploadFile, HTTPException
from typing import List

@app.post("/analyze-images")
async def analyze_images(files: List[UploadFile] = File(...)):
    results = []
    for file in files:
        # Validate and process each file as in the original code
        # Append the result for each image to the results list
        ...

    return {
        "status": "success",
        "results": results
    }
```
This modification allows the API to process multiple images in a single request, returning descriptions for all uploaded images.
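The loop body is left open in the example above. A minimal sketch of one way to fill it in is shown below, written without FastAPI so it can run on its own. `Upload` is a hypothetical stand-in for the `UploadFile` fields used here, `describe` is a placeholder for the LLM call, and reporting a per-file error instead of failing the whole batch is an assumed design choice, not the template's documented behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

ALLOWED_IMAGE_TYPES = ["image/jpeg", "image/png"]  # same list as the template


@dataclass
class Upload:
    """Hypothetical stand-in for the UploadFile fields this sketch needs."""
    filename: str
    content_type: str
    data: bytes


def analyze_batch(files: List[Upload], describe: Callable[[bytes], str]) -> Dict:
    """Validate each file and collect one result entry per file."""
    results = []
    for file in files:
        if file.content_type not in ALLOWED_IMAGE_TYPES:
            # Record the failure and keep going with the remaining files.
            results.append({
                "filename": file.filename,
                "status": "error",
                "detail": f"Unsupported type: {file.content_type}",
            })
            continue
        results.append({
            "filename": file.filename,
            "status": "success",
            "description": describe(file.data),
        })
    return {"status": "success", "results": results}
```

With this shape, a caller can match each entry to its upload by `filename` and decide how to treat the files that were skipped.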
Here's a step-by-step guide on how to use the Image-to-Text Description API Endpoint template:
Introduction
This template provides an API endpoint for uploading images and generating AI-based text descriptions of the image content. It uses FastAPI to create a simple server that accepts image uploads and leverages an AI model to analyze and describe the images.
Getting Started
- Click "Start with this Template" to begin using this template in the Lazy Builder interface.
- Press the "Test" button to initiate the deployment of the app and launch the Lazy CLI.
Using the API
Once the app is deployed, you'll receive a dedicated server link to access the API. Additionally, you'll get a link to the FastAPI documentation (usually ending with `/docs`), which provides an interactive interface to test the API.
Uploading an Image
To use the API, you need to send a POST request to the `/analyze-image` endpoint with an image file. Here's a sample request using cURL:
```bash
curl -X POST "https://your-api-url/analyze-image" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@path/to/your/image.jpg"
```
Replace `https://your-api-url` with the actual URL provided by Lazy, and `path/to/your/image.jpg` with the path to the image you want to analyze.
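The endpoint compares the upload's content type against `image/jpeg` and `image/png`, so it can be useful to check a file locally before sending it. The sketch below guesses the type from the file extension with Python's standard `mimetypes` module. `is_supported_image` is a hypothetical helper name, and how the server responds to other types is an assumption, since that part of the template code is not shown.

```python
import mimetypes

ALLOWED_IMAGE_TYPES = ["image/jpeg", "image/png"]  # same list as the template


def is_supported_image(path: str) -> bool:
    """Guess the MIME type from the file extension and check the allow-list."""
    guessed_type, _encoding = mimetypes.guess_type(path)
    return guessed_type in ALLOWED_IMAGE_TYPES
```

This only inspects the file name, not the file contents, so it is a convenience check rather than real validation.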
Sample Response
The API will return a JSON response with the image description. Here's an example:
```json
{
  "status": "success",
  "description": "The image shows a serene landscape with a calm lake reflecting the surrounding mountains. In the foreground, there's a wooden dock extending into the water. The sky is clear with a few wispy clouds, and the overall scene suggests a peaceful, early morning atmosphere in a mountainous region."
}
```
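A client will usually want just the `description` field. The following sketch parses a response body of the shape shown above. `extract_description` is a hypothetical helper, and treating any `status` other than `"success"` as a failure is an assumption, because the error response format is not documented here.

```python
import json


def extract_description(body: str) -> str:
    """Parse the JSON response body and return the image description."""
    payload = json.loads(body)
    # Assumption: failures are signalled by a status other than "success".
    if payload.get("status") != "success":
        raise ValueError(f"Unexpected status: {payload.get('status')!r}")
    return payload["description"]
```

Raising an exception on an unexpected status keeps error text from being mistaken for an image description further down the line.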
Integrating the API
To integrate this API into your application or service, you'll need to make HTTP POST requests to the provided API endpoint. Here's an example of how you might do this using Python and the `requests` library:
```python
import requests

def analyze_image(image_path, api_url):
    with open(image_path, 'rb') as image_file:
        files = {'file': image_file}
        response = requests.post(f"{api_url}/analyze-image", files=files)

    if response.status_code == 200:
        result = response.json()
        return result['description']
    else:
        return f"Error: {response.status_code} - {response.text}"

# Usage
api_url = "https://your-api-url"  # Replace with your actual API URL
image_path = "path/to/your/image.jpg"
description = analyze_image(image_path, api_url)
print(description)
```
Remember to replace `"https://your-api-url"` with the actual URL provided by Lazy for your deployed API.
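Network calls can fail transiently, so an integration may want to retry a failed request. The sketch below shows a generic retry wrapper with exponential backoff. It is not part of the template: `with_retries` and its parameters are hypothetical, and it assumes the wrapped call raises an exception on failure.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Because the `analyze_image` function above returns an error string instead of raising, it would need to raise on failure (for example by calling `response.raise_for_status()`) before a wrapper like this has any effect.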
By following these steps, you can easily use and integrate the Image-to-Text Description API Endpoint into your projects.
Template Benefits
Here are 5 key business benefits of this Image-to-Text Description API Endpoint template:
- Enhanced Content Accessibility: This API can automatically generate descriptive text for images, making visual content more accessible to visually impaired users and improving overall web accessibility compliance.
- Automated Image Cataloging: Businesses with large image databases can use this API to automatically generate descriptive tags and metadata, streamlining image organization and search functionality.
- Improved SEO Performance: By providing accurate, AI-generated descriptions for images, websites can boost their SEO performance, as search engines can better understand and index image content.
- Content Moderation Assistance: The API can be used to automatically screen and flag potentially inappropriate or sensitive image content, aiding in content moderation processes for social media platforms or user-generated content sites.
- E-commerce Product Description Generation: Online retailers can use this API to automatically generate detailed product descriptions from product images, saving time and ensuring consistency in product listings.