Axecodi Documentation

Everything you need to deploy, configure, and use Axecodi — the AI chat interface powered by NVIDIA NIM and GLM-5.1.

ℹ️

What is Axecodi?

Axecodi is a self-hostable AI chat frontend that connects to NVIDIA NIM's API. Deploy it on Cloudflare Pages for free in under 3 minutes.

Architecture Overview

Axecodi consists of three components working together:

Frontend — Static HTML/CSS/JS pages (this project). Hosted on Cloudflare Pages.
API Proxy — A Cloudflare Worker that forwards requests to NVIDIA NIM, keeping your API key secure.
NVIDIA NIM — The model inference backend powering all AI responses.
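
Seen from the browser, the flow above amounts to one POST to the proxy. Here is a minimal sketch, assuming the proxy is reachable at /api/chat (the route Pages Functions would derive from functions/api/chat.js) and that the body follows the OpenAI-compatible chat-completions shape. The helper name `buildChatRequest` and the model ID are hypothetical placeholders, not part of Axecodi.

```javascript
// Hypothetical helper: builds the fetch() arguments for a chat request.
// No API key appears here; the proxy adds it server-side.
function buildChatRequest(messages, options = {}) {
  // The model ID is a placeholder; use an ID your deployment supports.
  const { model = 'example-model-id', stream = true } = options;
  return {
    url: '/api/chat', // assumed route, derived from functions/api/chat.js
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Usage in the browser:
//   const { url, init } = buildChatRequest([{ role: 'user', content: 'Hello' }]);
//   const response = await fetch(url, init);
```

Keeping the request builder free of credentials is the point of the three-part design: the browser only ever talks to the proxy.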

Quickstart

Get Axecodi running in minutes with no backend infrastructure required.

1. Get your NVIDIA NIM API Key

Sign up at build.nvidia.com and navigate to API Keys to generate your key. The free tier includes generous credits to get started.

2. Clone or download the project

bash

# Download the project files
git clone https://github.com/yourrepo/axecodi.git
cd axecodi

3. Deploy to Cloudflare Pages

Go to pages.cloudflare.com → Create a project → Upload the project folder directly. No build step is required because the frontend is static HTML, CSS, and JavaScript.

4. Set environment variables

In your Cloudflare Pages project settings, navigate to Settings → Environment Variables and add:

env

NVIDIA_API_KEY=nvapi-xxxxxxxxxxxxxxxxxxxx

✅

You're live!

Your Axecodi instance is now available at yourproject.pages.dev. Share it with anyone — no accounts needed to chat.

Deployment

Cloudflare Pages (Recommended)

The easiest and fastest option. Free tier includes 500 deployments/month and unlimited bandwidth.

bash

# Using Wrangler CLI
npm install -g wrangler
wrangler pages deploy . --project-name axecodi

Custom Domain

In Cloudflare Pages → Custom Domains, add your domain. HTTPS is automatic via Cloudflare's edge.

Environment Variables

Configure Axecodi via environment variables in your Cloudflare Pages settings.

Variable	Description	Required
NVIDIA_API_KEY	Your NVIDIA NIM API key from build.nvidia.com	Required
DEFAULT_MODEL	Default model ID to use on load (default: glm-4-5)	Optional
MAX_TOKENS	Maximum tokens per response (default: 2048)	Optional
SYSTEM_PROMPT	Default system prompt injected into all conversations	Optional
RATE_LIMIT	Max requests per minute per IP (default: 30)	Optional
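
The table can be read as a small configuration loader. The following is a sketch only, assuming the variables reach the proxy through the Pages Function `env` object as strings. The function name `loadConfig` and the empty-string fallback for SYSTEM_PROMPT are assumptions; the other defaults are the ones documented above.

```javascript
// Sketch: read Axecodi settings from the Pages Function `env` object.
// Environment variables arrive as strings, so numeric values are parsed.
function loadConfig(env) {
  if (!env.NVIDIA_API_KEY) {
    throw new Error('NVIDIA_API_KEY is required');
  }
  // Parse a positive integer, falling back to the documented default.
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    apiKey: env.NVIDIA_API_KEY,
    defaultModel: env.DEFAULT_MODEL || 'glm-4-5',
    maxTokens: toInt(env.MAX_TOKENS, 2048),
    systemPrompt: env.SYSTEM_PROMPT || '', // assumption: no prompt when unset
    rateLimit: toInt(env.RATE_LIMIT, 30),
  };
}
```

Failing fast on a missing key surfaces a misconfigured deployment on the first request rather than as an opaque upstream error.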

Slash Commands

Type / in the chat input to use any of the following commands:

Command	Description
/help	Show all available commands
/model [name]	Switch to a different AI model
/temp [0-2]	Set the sampling temperature, from 0 to 2; higher values produce more varied responses
/system [prompt]	Set a custom system prompt for this conversation
/export	Export the current chat as Markdown
/clear	Clear the current conversation
/retry	Regenerate the last AI response
/tokens	Show token count for current conversation
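
To illustrate how input like this might be recognized, here is a minimal parser sketch. It is not Axecodi's actual implementation: the function name `parseSlashCommand`, the return shape, the error messages, and the case-insensitive matching are all assumptions. Only the command names and the 0 to 2 temperature range come from the table above.

```javascript
// Command names taken from the table above.
const KNOWN_COMMANDS = new Set([
  'help', 'model', 'temp', 'system', 'export', 'clear', 'retry', 'tokens',
]);

// Sketch: returns { name, arg } for a command, { error } for a bad command,
// or null when the input is an ordinary chat message.
function parseSlashCommand(input) {
  const match = /^\/(\w+)(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (!match) return null; // not a command
  const name = match[1].toLowerCase();
  if (!KNOWN_COMMANDS.has(name)) return { error: `Unknown command: /${name}` };
  const arg = (match[2] || '').trim();

  if (name === 'temp') {
    // /temp accepts a number in the documented 0-2 range.
    const value = Number(arg);
    if (arg === '' || Number.isNaN(value) || value < 0 || value > 2) {
      return { error: 'Usage: /temp [0-2]' };
    }
    return { name, arg: value };
  }
  return { name, arg };
}
```

For example, `parseSlashCommand('/temp 0.7')` yields a `temp` command with a numeric argument, while plain text such as `hello` yields `null` and is sent as a normal message.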

File Uploads

Axecodi supports attaching files to your messages. Click the 📎 button or paste an image with Ctrl+V.

Images — PNG, JPEG, WebP, GIF (up to 20MB)
Documents — PDF, TXT, Markdown (up to 10MB)
Code files — Any plain text format
Data — CSV, JSON (up to 5MB)
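
The limits above could be enforced client-side before a file is attached. This sketch is illustrative only: `checkUpload` is a hypothetical name and the extension lists are a guess at how each category maps to file types. It also assumes binary megabytes, and that files with any other extension count as code files, which have no documented limit.

```javascript
const MB = 1024 * 1024; // assumption: limits are binary megabytes

// Size limits come from the list above; the extension lists are illustrative.
const UPLOAD_RULES = [
  { category: 'image', extensions: ['png', 'jpg', 'jpeg', 'webp', 'gif'], maxBytes: 20 * MB },
  { category: 'document', extensions: ['pdf', 'txt', 'md'], maxBytes: 10 * MB },
  { category: 'data', extensions: ['csv', 'json'], maxBytes: 5 * MB },
];

// Accepts anything with `name` and `size`, such as a browser File object.
function checkUpload(file) {
  const extension = file.name.includes('.')
    ? file.name.split('.').pop().toLowerCase()
    : '';
  const rule = UPLOAD_RULES.find((r) => r.extensions.includes(extension));
  if (!rule) {
    // Assumption: anything else is a code file; no size limit is documented.
    return { ok: true, category: 'code' };
  }
  if (file.size > rule.maxBytes) {
    return { ok: false, category: rule.category, reason: `Exceeds ${rule.maxBytes / MB} MB limit` };
  }
  return { ok: true, category: rule.category };
}
```

A real implementation would likely also inspect the MIME type, since file extensions are easy to spoof.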

⚠️

Model compatibility

Not all models support file inputs. GLM-5.1, Llama 3.3 70B, and Qwen 2.5 72B have vision/document understanding. Check the Models page for compatibility.

API Proxy

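The Cloudflare Worker, deployed as a Pages Function in functions/api/chat.js, acts as a secure proxy between the frontend and NVIDIA NIM. It handles authentication, rate limiting, and CORS. The listing below shows the core forwarding logic.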

javascript · functions/api/chat.js

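// Pages Function: forwards chat requests to NVIDIA NIM so the API key stays server-side.
export async function onRequestPost(context) {
  const { request, env } = context;

  // Reject malformed JSON instead of failing with an unhandled exception.
  let body;
  try {
    body = await request.json();
  } catch {
    return new Response('Invalid JSON body', { status: 400 });
  }

  const response = await fetch(
    'https://integrate.api.nvidia.com/v1/chat/completions',
    {
      method: 'POST',
      headers: {
        // The key is injected here and never reaches the browser.
        'Authorization': `Bearer ${env.NVIDIA_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    }
  );

  // Pass the upstream body through, keeping its status and content type
  // (an event stream when streaming, JSON for errors and non-streaming calls).
  return new Response(response.body, {
    status: response.status,
    headers: {
      'Content-Type': response.headers.get('Content-Type') ?? 'text/event-stream',
      'Access-Control-Allow-Origin': '*',
    },
  });
}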
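
The handler above sets Access-Control-Allow-Origin only on POST responses. When the frontend is served from a different origin than the proxy, browsers first send an OPTIONS preflight for a JSON POST, and that request also needs an answer. A minimal sketch follows, in which `corsHeaders` and `handlePreflight` are hypothetical names. Same-origin deployments, where the pages and the function share a domain, do not need this.

```javascript
// Hypothetical helper: headers that let a browser on another origin call the proxy.
function corsHeaders(allowedOrigin = '*') {
  return {
    'Access-Control-Allow-Origin': allowedOrigin,
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    'Access-Control-Max-Age': '86400', // let browsers cache the preflight for a day
  };
}

// Answers the OPTIONS preflight with an empty 204 response.
// In functions/api/chat.js this would be exported as `onRequestOptions`.
function handlePreflight(allowedOrigin = '*') {
  return new Response(null, { status: 204, headers: corsHeaders(allowedOrigin) });
}
```

Passing a specific origin instead of `*` restricts which sites may call the proxy from a browser. It does not stop non-browser clients, so it complements rate limiting rather than replacing it.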

FAQ

Is my API key exposed to users?

No. The API key is stored as a Cloudflare environment variable and is never sent to the browser. All requests go through the Cloudflare Worker, which injects the key server-side.

Can I use a different AI provider?

Yes! The proxy worker can be modified to point to any OpenAI-compatible API (OpenRouter, Together AI, Groq, etc.) by changing the endpoint URL and auth format.
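
One way to make that change without editing code for each provider is to read the endpoint from the environment. This is a sketch under assumptions: `UPSTREAM_URL` and `UPSTREAM_API_KEY` are hypothetical variable names that Axecodi does not define, and bearer-token auth is assumed because it is common among OpenAI-compatible APIs, though not universal.

```javascript
const NVIDIA_ENDPOINT = 'https://integrate.api.nvidia.com/v1/chat/completions';

// Sketch: pick the upstream endpoint and auth header from environment variables.
// UPSTREAM_URL and UPSTREAM_API_KEY are hypothetical names, not Axecodi settings.
function resolveUpstream(env) {
  const url = env.UPSTREAM_URL || NVIDIA_ENDPOINT;
  const key = env.UPSTREAM_API_KEY || env.NVIDIA_API_KEY;
  return {
    url,
    headers: {
      'Authorization': `Bearer ${key}`, // bearer auth is common but not universal
      'Content-Type': 'application/json',
    },
  };
}

// Usage inside the proxy:
//   const { url, headers } = resolveUpstream(env);
//   const response = await fetch(url, { method: 'POST', headers, body: JSON.stringify(body) });
```

With neither hypothetical variable set, the function falls back to the NVIDIA NIM endpoint and key, so existing deployments would behave as before.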

Where is chat history stored?

All conversation history is stored in localStorage in the user's browser. Nothing is sent to any server except the actual messages during inference.
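
As an illustration of browser-side persistence, the sketch below stores conversations as one JSON array. The storage key, the conversation shape (`id` plus `messages`), and both function names are hypothetical; Axecodi's actual schema may differ. Passing the storage object in as a parameter keeps the functions testable outside a browser.

```javascript
const HISTORY_KEY = 'axecodi:conversations'; // hypothetical storage key

// `storage` is anything with getItem/setItem, such as window.localStorage.
function loadConversations(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(HISTORY_KEY) ?? '[]');
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return []; // corrupted data: start fresh rather than crash
  }
}

function saveConversation(storage, conversation) {
  // Drop any stored conversation with the same id, then append the new version.
  const others = loadConversations(storage).filter((c) => c.id !== conversation.id);
  storage.setItem(HISTORY_KEY, JSON.stringify([...others, conversation]));
}

// Usage in the browser:
//   saveConversation(window.localStorage, { id: 'c1', messages: [] });
```

Because the data lives in localStorage, it is tied to one browser profile and origin; clearing site data removes it.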

Is there a rate limit?

By default, the Cloudflare Worker enforces 30 requests/minute per IP. This can be adjusted via the RATE_LIMIT environment variable.
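
A per-IP limit of this kind can be sketched as a fixed-window counter. This is not Axecodi's actual limiter: `createRateLimiter` is a hypothetical name, and the counter lives in memory, which in a Worker is local to each isolate and not durable. A production limiter would need shared state or a platform-level rate limiting feature.

```javascript
// Sketch: fixed-window limiter keyed by client IP.
function createRateLimiter(limit = 30, windowMs = 60000) {
  const windows = new Map(); // ip -> { start, count }

  // Returns true if the request is allowed. `now` is injectable for testing.
  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    if (!entry || now - entry.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 }); // first request of a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Usage inside the proxy (CF-Connecting-IP is the client IP header Cloudflare sets):
//   const allow = createRateLimiter(30);
//   const ip = request.headers.get('CF-Connecting-IP') ?? 'unknown';
//   if (!allow(ip)) return new Response('Too many requests', { status: 429 });
```

A fixed window is the simplest scheme but allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.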

Changelog

v2.0.0 — May 2025

Added GLM-5.1 as the new flagship model with chain-of-thought rendering
Redesigned UI with new dark theme and grid background
Multi-conversation support with local history
Slash commands system
File upload support (images, PDFs, code)
Temperature slider in UI
Export to Markdown

v1.5.0 — February 2025

Added DeepSeek R1 and Qwen 2.5 models
Streaming token-by-token responses
Clipboard image paste (Ctrl+V)
Code block syntax highlighting