LLM API Beginner's Guide

New to LLM APIs? This guide helps you understand the core concepts quickly

What is an LLM?

LLM (Large Language Model) is an AI model trained on massive text datasets that can understand and generate natural language. Common LLMs include OpenAI's GPT series, Anthropic's Claude series, and Google's Gemini series.

What is an API?

API (Application Programming Interface) is a communication interface between applications. LLM APIs let you call these models via code to get text generation, translation, summarization, and more — without training your own model.

What are Tokens?

Tokens are the basic units LLMs use to process text. Approximately 1 Chinese character ≈ 1-2 tokens, 1 English word ≈ 1 token. API billing is based on token count.

Input vs Output Tokens

Input tokens are the text you send to the model (e.g., your question). Output tokens are the text the model returns (e.g., the answer). Output pricing is typically 2-4x the input price.

Context Window

Context Window is the maximum number of tokens a model can process at once. For example, 128K tokens can handle roughly a 100-page book. Text beyond the limit gets truncated.

How to Choose a Model?

When choosing a model, consider these factors:

  • Task complexity: Use cheaper models for simple tasks (e.g., GPT-4o mini), top-tier models for complex reasoning (e.g., Claude Opus)
  • Budget: Prices range from $0.05 to $75 per million tokens — a 1500x difference
  • Speed needs: Some models respond faster (e.g., Gemini Flash), ideal for real-time interactions
  • Context needs: Long document processing requires large context windows (128K+)

Common Pricing Models

LLM APIs have multiple pricing dimensions:

  • Input price: Cost per million input tokens
  • Output price: Cost per million output tokens
  • Cached price: Discounted rate for reusing the same prompt (typically 50% off)
  • Batch price: Non-real-time batch processing, typically half price

How to Get Started?

Choose a provider (e.g., OpenAI, Anthropic) and create an account. Go to the API Keys page and create a key. Use HTTP requests or an SDK to call the API. Start with the cheapest model to test, then upgrade based on your needs.