
Axecodi Documentation

Everything you need to deploy, configure, and use Axecodi — the AI chat interface powered by NVIDIA NIM and GLM-5.1.

ℹ️
What is Axecodi?

Axecodi is a self-hostable AI chat frontend that connects to NVIDIA NIM's API. Deploy it on Cloudflare Pages for free in under 3 minutes.

Architecture Overview

Axecodi consists of three components working together:

  • Frontend — Static HTML/CSS/JS pages (this project). Hosted on Cloudflare Pages.
  • API Proxy — A Cloudflare Worker that forwards requests to NVIDIA NIM, keeping your API key secure.
  • NVIDIA NIM — The model inference backend powering all AI responses.
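
The flow between the three components can be sketched from the browser's side. This is a minimal sketch, not the project's actual client code. It assumes the proxy is reachable at /api/chat (the route implied by functions/api/chat.js) and that NVIDIA NIM streams OpenAI-style server-sent events. The names extractDeltas and streamChat are hypothetical.

```javascript
// Hypothetical helper: pull the text deltas out of one chunk of an
// OpenAI-style SSE stream ("data: {...}" lines, ended by "data: [DONE]").
// Assumes each chunk contains whole lines; a production client would
// buffer partial lines across chunks.
function extractDeltas(chunk) {
  const deltas = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip comments and blanks
    const payload = trimmed.slice(5).trim();
    if (payload === '' || payload === '[DONE]') continue;
    try {
      const event = JSON.parse(payload);
      const text = event.choices?.[0]?.delta?.content;
      if (typeof text === 'string') deltas.push(text);
    } catch {
      // Ignore lines that are not valid JSON
    }
  }
  return deltas;
}

// Sketch of the browser side: POST the conversation to the proxy and
// pass each decoded text fragment to a callback as it arrives.
async function streamChat(messages, model, onText) {
  const res = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages, stream: true }),
  });
  if (!res.ok) throw new Error(`Proxy returned HTTP ${res.status}`);
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const text of extractDeltas(decoder.decode(value, { stream: true }))) {
      onText(text);
    }
  }
}
```

The API key never appears in this code: the browser talks only to the proxy, which adds the key before calling NVIDIA NIM.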

Quickstart

Get Axecodi running in minutes with no backend infrastructure required.

1. Get your NVIDIA NIM API Key

Sign up at build.nvidia.com and navigate to API Keys to generate your key. The free tier includes generous credits to get started.

2. Clone or download the project

bash
# Download the project files
git clone https://github.com/yourrepo/axecodi.git
cd axecodi

3. Deploy to Cloudflare Pages

Go to pages.cloudflare.com → Create a project → Upload the project folder directly. No build step required — it's pure HTML.

4. Set environment variables

In your Cloudflare Pages project settings, navigate to Settings → Environment Variables and add:

env
NVIDIA_API_KEY=nvapi-xxxxxxxxxxxxxxxxxxxx

You're live!

Your Axecodi instance is now available at yourproject.pages.dev. Share it with anyone — no accounts needed to chat.


Deployment

Cloudflare Pages (Recommended)

The easiest and fastest option. Free tier includes 500 deployments/month and unlimited bandwidth.

bash
# Using Wrangler CLI
npm install -g wrangler
wrangler pages deploy . --project-name axecodi

Custom Domain

In Cloudflare Pages → Custom Domains, add your domain. HTTPS is automatic via Cloudflare's edge.


Environment Variables

Configure Axecodi via environment variables in your Cloudflare Pages settings.

| Variable | Description | Required |
|---|---|---|
| NVIDIA_API_KEY | Your NVIDIA NIM API key from build.nvidia.com | Required |
| DEFAULT_MODEL | Default model ID to use on load (default: glm-4-5) | Optional |
| MAX_TOKENS | Maximum tokens per response (default: 2048) | Optional |
| SYSTEM_PROMPT | Default system prompt injected into all conversations | Optional |
| RATE_LIMIT | Max requests per minute per IP (default: 30) | Optional |
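
A minimal sketch of how the proxy might read these variables and apply the documented defaults. The helper name resolveConfig and the shape of the returned object are assumptions for illustration; only the variable names and default values come from the table above.

```javascript
// Hypothetical helper: read the documented variables from the `env`
// object and fall back to the defaults listed in the table above.
function resolveConfig(env) {
  if (!env.NVIDIA_API_KEY) {
    throw new Error('NVIDIA_API_KEY is required');
  }
  // Environment variables arrive as strings, so numeric ones are parsed.
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    apiKey: env.NVIDIA_API_KEY,
    defaultModel: env.DEFAULT_MODEL || 'glm-4-5',
    maxTokens: toInt(env.MAX_TOKENS, 2048),
    systemPrompt: env.SYSTEM_PROMPT || '',
    rateLimit: toInt(env.RATE_LIMIT, 30),
  };
}
```

Inside a Pages Function, the `env` argument would be `context.env`.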

Slash Commands

Type / in the chat input to see the available commands:

| Command | Description |
|---|---|
| /help | Show all available commands |
| /model [name] | Switch to a different AI model |
| /temp [0-2] | Set temperature for response randomness |
| /system [prompt] | Set a custom system prompt for this conversation |
| /export | Export the current chat as Markdown |
| /clear | Clear the current conversation |
| /retry | Regenerate the last AI response |
| /tokens | Show token count for current conversation |
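
A minimal sketch of how input beginning with / could be parsed into a command and its argument. The function name parseSlashCommand, the return shape, and the error messages are hypothetical; the command names and the 0 to 2 temperature range come from the table above.

```javascript
// Hypothetical parser for the commands in the table above.
// Returns null for ordinary messages, or { name, arg } / { error }.
const COMMANDS = new Set([
  'help', 'model', 'temp', 'system', 'export', 'clear', 'retry', 'tokens',
]);

function parseSlashCommand(input) {
  const text = input.trim();
  if (!text.startsWith('/')) return null; // not a command

  // Split "/name rest of line" into the command name and its argument.
  const space = text.indexOf(' ');
  const name = (space === -1 ? text.slice(1) : text.slice(1, space)).toLowerCase();
  const arg = space === -1 ? '' : text.slice(space + 1).trim();

  if (!COMMANDS.has(name)) return { error: `Unknown command: /${name}` };

  if (name === 'temp') {
    // /temp takes a number in the documented range 0-2.
    const value = Number(arg);
    if (arg === '' || Number.isNaN(value) || value < 0 || value > 2) {
      return { error: 'Temperature must be a number between 0 and 2' };
    }
    return { name, arg: value };
  }
  if ((name === 'model' || name === 'system') && arg === '') {
    return { error: `/${name} needs an argument` };
  }
  return { name, arg };
}
```

For example, `parseSlashCommand('/temp 0.7')` yields a `temp` command with the numeric argument 0.7, while a plain message returns null and is sent to the model as usual.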

File Uploads

Axecodi supports attaching files to your messages. Click the 📎 button or paste an image with Ctrl+V.

  • Images — PNG, JPEG, WebP, GIF (up to 20MB)
  • Documents — PDF, TXT, Markdown (up to 10MB)
  • Code files — Any plain text format
  • Data — CSV, JSON (up to 5MB)
⚠️
Model compatibility

Not all models support file inputs. GLM-5.1, Llama 3.3 70B, and Qwen 2.5 72B have vision/document understanding. Check the Models page for compatibility.
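
A minimal sketch of a client-side check for the file types and size limits listed above. The MIME types, the validateUpload name, and the choice to treat any other text/* type as a code file are assumptions. No size limit is documented for code files, so none is applied here.

```javascript
const MB = 1024 * 1024; // assumes binary megabytes; the docs do not specify

// Categories and size limits from the list above. MIME types are assumed.
const UPLOAD_RULES = [
  { kind: 'image', types: ['image/png', 'image/jpeg', 'image/webp', 'image/gif'], maxBytes: 20 * MB },
  { kind: 'document', types: ['application/pdf', 'text/plain', 'text/markdown'], maxBytes: 10 * MB },
  { kind: 'data', types: ['text/csv', 'application/json'], maxBytes: 5 * MB },
];

// `file` needs only `type` and `size`, like a browser File object.
function validateUpload(file) {
  const type = file.type || '';
  let rule = UPLOAD_RULES.find((r) => r.types.includes(type));
  // Any other text/* type is treated as a code file (assumption).
  if (!rule && type.startsWith('text/')) {
    rule = { kind: 'code', maxBytes: Infinity };
  }
  if (!rule) {
    return { ok: false, reason: `Unsupported file type: ${type || 'unknown'}` };
  }
  if (file.size > rule.maxBytes) {
    return { ok: false, reason: `${rule.kind} files are limited to ${rule.maxBytes / MB} MB` };
  }
  return { ok: true, kind: rule.kind };
}
```

A check like this only improves the user experience; it does not tell you whether the selected model can actually read the attachment.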


API Proxy

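The Cloudflare Worker acts as a secure proxy between the frontend and NVIDIA NIM. It handles authentication, rate limiting, and CORS. The handler below shows the core forwarding step; it is written as a Pages Function, which runs on the Workers runtime.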

javascript · functions/api/chat.js
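export async function onRequestPost(context) {
  const { request, env } = context;

  // Reject malformed JSON early instead of forwarding it upstream
  let body;
  try {
    body = await request.json();
  } catch {
    return new Response('Invalid JSON body', { status: 400 });
  }

  // Forward the request to NVIDIA NIM, adding the API key server-side
  const response = await fetch(
    'https://integrate.api.nvidia.com/v1/chat/completions',
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${env.NVIDIA_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    }
  );

  // Stream the upstream body back, preserving its status and content type
  // so that error responses are not reported as a successful event stream
  return new Response(response.body, {
    status: response.status,
    headers: {
      'Content-Type': response.headers.get('Content-Type') || 'text/event-stream',
      'Access-Control-Allow-Origin': '*',
    },
  });
}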
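
When the frontend is served from a different origin than the proxy, browsers send an OPTIONS preflight before a JSON POST. A minimal sketch of answering it follows. The helper corsHeaders and the specific header values are assumptions, not the project's actual code. In functions/api/chat.js the handler would be exported as `onRequestOptions`, which Pages Functions invokes for OPTIONS requests.

```javascript
// Hypothetical helper that builds the CORS headers for the proxy.
// `allowedOrigin` defaults to '*', matching the POST handler's header.
function corsHeaders(allowedOrigin = '*') {
  return {
    'Access-Control-Allow-Origin': allowedOrigin,
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '86400', // let browsers cache the preflight for a day
  };
}

// Answer the preflight with an empty 204 response carrying the CORS headers.
// In a Pages Function file this would be written as `export function ...`.
function onRequestOptions() {
  return new Response(null, { status: 204, headers: corsHeaders() });
}
```

Passing a specific origin, such as your pages.dev URL, to `corsHeaders` instead of the default `*` restricts which sites can call the proxy from a browser.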

FAQ

Is my API key exposed to users?

No. The API key is stored as a Cloudflare environment variable and is never sent to the browser. All requests go through the Cloudflare Worker, which injects the key server-side.

Can I use a different AI provider?

Yes! The proxy worker can be modified to point to any OpenAI-compatible API (OpenRouter, Together AI, Groq, etc.) by changing the endpoint URL and auth format.
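
A minimal sketch of what that change could look like, assuming each provider accepts a Bearer token at an OpenAI-style chat completions endpoint. The helper buildUpstreamRequest and the PROVIDERS map are hypothetical. The non-NVIDIA URLs are included for illustration and should be checked against each provider's documentation before use.

```javascript
// Chat-completions endpoints for a few OpenAI-compatible providers.
// Assumption: all accept an "Authorization: Bearer <key>" header.
const PROVIDERS = {
  nvidia: 'https://integrate.api.nvidia.com/v1/chat/completions',
  openrouter: 'https://openrouter.ai/api/v1/chat/completions',
  together: 'https://api.together.xyz/v1/chat/completions',
  groq: 'https://api.groq.com/openai/v1/chat/completions',
};

// Hypothetical helper: build the arguments for fetch() for a given provider.
function buildUpstreamRequest(provider, apiKey, body) {
  if (!Object.prototype.hasOwnProperty.call(PROVIDERS, provider)) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return {
    url: PROVIDERS[provider],
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}
```

The proxy would then call `fetch(url, init)` with the returned values. Model IDs differ between providers, so the `model` field in the body has to change along with the endpoint.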

Where is chat history stored?

All conversation history is stored in localStorage in the user's browser. Nothing is sent to any server except the actual messages during inference.
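
A minimal sketch of how conversations might be kept in localStorage. The storage key, the record shape, and the function names are hypothetical; the actual layout used by Axecodi is not documented here.

```javascript
const STORAGE_KEY = 'axecodi.conversations'; // hypothetical key name

// `storage` is anything with the Web Storage getItem/setItem methods;
// in the browser you would pass window.localStorage.
function loadConversations(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(STORAGE_KEY) ?? '[]');
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted data: start fresh rather than crash
  }
}

// Append a message to a conversation, creating the conversation if needed,
// and write the whole list back to storage.
function saveMessage(storage, conversationId, message) {
  const conversations = loadConversations(storage);
  let convo = conversations.find((c) => c.id === conversationId);
  if (!convo) {
    convo = { id: conversationId, messages: [] };
    conversations.push(convo);
  }
  convo.messages.push(message);
  storage.setItem(STORAGE_KEY, JSON.stringify(conversations));
  return convo;
}
```

Because the data lives only in the browser, clearing site data or switching devices loses the history, and localStorage quotas (typically a few megabytes) cap how much can be kept.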

Is there a rate limit?

By default, the Cloudflare Worker enforces 30 requests/minute per IP. This can be adjusted via the RATE_LIMIT environment variable.
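
A minimal sketch of a per-IP limit, using a fixed one-minute window. This illustrates the idea and is not the project's actual limiter: the name createRateLimiter is hypothetical, and the counters live in memory, so in a Worker they are per-isolate and reset on restart.

```javascript
// Fixed-window counter keyed by client IP. State is in memory only,
// so treat the limit as best-effort rather than exact.
function createRateLimiter(limitPerMinute = 30, now = () => Date.now()) {
  const windows = new Map(); // ip -> { start, count }
  const WINDOW_MS = 60000;

  return function allow(ip) {
    const t = now();
    const entry = windows.get(ip);
    if (!entry || t - entry.start >= WINDOW_MS) {
      windows.set(ip, { start: t, count: 1 }); // start a new window
      return true;
    }
    if (entry.count >= limitPerMinute) return false; // over the limit
    entry.count += 1;
    return true;
  };
}
```

In the proxy, the client IP can be read from the `CF-Connecting-IP` request header that Cloudflare adds, and a rejected request would typically get an HTTP 429 response.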


Changelog

v2.0.0 — May 2025

  • Added GLM-5.1 as the new flagship model with chain-of-thought rendering
  • Redesigned UI with new dark theme and grid background
  • Multi-conversation support with local history
  • Slash commands system
  • File upload support (images, PDFs, code)
  • Temperature slider in UI
  • Export to Markdown

v1.5.0 — February 2025

  • Added DeepSeek R1 and Qwen 2.5 models
  • Streaming token-by-token responses
  • Clipboard image paste (Ctrl+V)
  • Code block syntax highlighting