Gemini 3.1 Flash-Lite

Model Information

Display Name: Gemini 3.1 Flash-Lite

API Model ID: google/gemini-3.1-flash-lite

Category: Image To Text

Description: Gemini 3.1 Flash-Lite is Google's most cost-efficient model in the 3.x lineup. It is optimized for high-volume, cost-sensitive workloads, with fast response times and a 1M context window.

Key Features:

  • 1M token context window (1,048,576 tokens)
  • Up to 65K output tokens
  • Vision: text, image, video, audio, PDF input
  • Lightweight agentic workflows
  • Simple data extraction and classification
  • High throughput optimization

Capabilities:

  • Fast text generation and chat
  • Data extraction and classification
  • Document summarization
  • Simple reasoning tasks
  • Multimodal processing (text, images, video)
  • Structured data generation
  • High-volume batch processing
  • Simple tool use

Best For:

  • High-volume, low-latency applications
  • Simple data extraction and classification
  • Cost-effective inference at massive scale
  • Chat and conversational AI
  • Lightweight agentic tasks

Technical Specs:

  • Model ID: gemini-3.1-flash-lite
  • Context Window: 1,048,576 tokens (1M)
  • Max Output: 65,536 tokens
  • Modalities: Text, image, video, audio, PDF input
  • API: Google AI (generativelanguage.googleapis.com)
  • Thinking Mode: Not supported
  • Tool Use: Basic function calling
  • Status: GA (May 2026)

Context Window: 1,048,576 tokens

Max Output: 65,536 tokens
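
The two limits above can be turned into a quick budget check. The sketch below assumes the prompt and the reply share a single context window, which this page does not confirm; the function name is hypothetical.

```python
CONTEXT_WINDOW = 1_048_576  # tokens, from the spec above
MAX_OUTPUT = 65_536         # tokens, from the spec above


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT) -> int:
    """Largest prompt that still leaves `reserved_output` tokens for the reply.

    Assumption: input and output tokens count against the same window.
    """
    if not 1 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be in 1..{MAX_OUTPUT}")
    return CONTEXT_WINDOW - reserved_output


print(max_prompt_tokens())      # 983040
print(max_prompt_tokens(1024))  # 1047552
```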

How to Use This Model

To use Gemini 3.1 Flash-Lite via the HInow.ai API, use the model ID: google/gemini-3.1-flash-lite

API Request Example (Chat/Text)


POST https://api.hinow.ai/v1/chat/completions
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "model": "google/gemini-3.1-flash-lite",
  "messages": [
    {"role": "user", "content": "Your message here"}
  ]
}
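
The same chat request could be issued from Python roughly as follows. This is a minimal sketch using only the standard library; the helper names are hypothetical, and because this page does not document the response schema, the reply is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

API_URL = "https://api.hinow.ai/v1/chat/completions"
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_chat_request(message: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the example above."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call would look like `send(build_chat_request("Your message here", api_key))`, with `api_key` read from wherever you store the key.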
              

API Request Example (Image Generation)


POST https://api.hinow.ai/v1/images
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "model": "google/gemini-3.1-flash-lite",
  "prompt": "Your image description here"
}
              

Pricing

  • input: $0.275 per 1M tokens
  • output: $1.65 per 1M tokens
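
As a rough illustration of how these rates add up, the sketch below estimates the cost of a single request. The page quotes both prices per 1M tokens; the function name and the token counts in the usage line are made up for the example.

```python
INPUT_USD_PER_MILLION = 0.275   # from the pricing list above
OUTPUT_USD_PER_MILLION = 1.65   # from the pricing list above


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
print(f"${estimate_cost_usd(200_000, 4_000):.4f}")  # $0.0616
```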

Available Parameters

  • temperature: Controls randomness (0-2). Default: 1 (Options: 0, 0.3, 0.5, 0.7, 1.0, 1.5, 2.0)
  • top_p: Nucleus sampling (0-1). Default: 0.95 (Options: 0.1, 0.5, 0.7, 0.9, 0.95, 1.0)
  • max_tokens: Max tokens to generate (1-65536) (Options: 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536)
  • response_format: Output format (Options: text, json_object)
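
The ranges listed above can be checked on the client before a request is sent. The following is a sketch only: it validates against the stated ranges rather than the discrete option lists, since the page does not say whether intermediate values (for example a temperature of 0.2) are accepted. The function names are hypothetical.

```python
ALLOWED_FORMATS = {"text", "json_object"}

# (name, minimum, maximum, must_be_integer), taken from the list above
NUMERIC_RANGES = (
    ("temperature", 0, 2, False),
    ("top_p", 0, 1, False),
    ("max_tokens", 1, 65536, True),
)


def _as_number(value):
    """Return value as a float, or None if it is not numeric.

    Numeric strings such as "0.1" are accepted as well as plain numbers.
    """
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def validate_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for name, low, high, integer_only in NUMERIC_RANGES:
        if name not in params:
            continue
        number = _as_number(params[name])
        if number is None or not low <= number <= high:
            errors.append(f"{name} must be between {low} and {high}, got {params[name]!r}")
        elif integer_only and not number.is_integer():
            errors.append(f"{name} must be a whole number, got {params[name]!r}")
    fmt = params.get("response_format")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        errors.append(f"response_format must be one of {sorted(ALLOWED_FORMATS)}, got {fmt!r}")
    return errors
```

For example, `validate_params({"temperature": 3})` returns one error, while the defaults (`temperature` 1, `top_p` 0.95) pass.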

Quick Reference

To use this model, set: "model": "google/gemini-3.1-flash-lite"

Featured: No

Documentation: https://hinow.ai/models/google/gemini-3.1-flash-lite

API Endpoint: https://api.hinow.ai/v1

Price: $0.275 / $1.65 per 1M tokens (input/output)

Capabilities

Image To Text, Text To Text
Context: 1049K tokens
Max Output: 66K tokens

Code Examples

curl -X POST https://api.hinow.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HINOW_API_KEY" \
  -d '{
    "model": "google/gemini-3.1-flash-lite",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image"},
          {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }
    ],
    "parameters": {
      "temperature": "0",
      "top_p": "0.1",
      "max_tokens": "256",
      "response_format": "text"
    }
  }'
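
For reference, the request body of the curl command above could be assembled in Python along these lines. This sketch mirrors the example as printed, including the nested "parameters" object with string values; whether the API also accepts numeric values or top-level sampling fields is not stated on this page. The function name is hypothetical.

```python
import json

MODEL_ID = "google/gemini-3.1-flash-lite"


def build_vision_payload(prompt: str, image_url: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for an image-plus-text request, as in the curl example."""
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        # Values are strings because the curl example sends them that way.
        "parameters": {
            "temperature": "0",
            "top_p": "0.1",
            "max_tokens": str(max_tokens),
            "response_format": "text",
        },
    }


payload = build_vision_payload("Describe this image", "https://example.com/image.jpg")
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and posted to https://api.hinow.ai/v1/chat/completions with the same headers as the curl command.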