Test-1-4000: A 190M Parameter Narrative Engine


Overview

Test-1-4000 is the final training checkpoint of a compact decoder-only Transformer model built on the Llama architecture and trained on the TinyStories dataset.

The project focuses on studying how narrative coherence, logical consistency, and language fluency emerge inside small-scale language models through structured training.

By step 4000 the model is markedly more stable and fluent than earlier checkpoints, reaching a final training loss of 0.573 after nearly two full epochs of training.


Model Highlights

| Feature | Specification |
|---|---|
| Architecture | Llama-based decoder-only Transformer |
| Parameters | 190.55M |
| Context Window | 2048 tokens |
| Final Training Step | 4000 |
| Final Training Loss | 0.573 |
| Precision | bfloat16 |
| Attention Backend | Flash Attention 2 |
| Compilation | torch.compile |
| Tokenizer | GPT-2 tokenizer |
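
Flash Attention 2 and torch.compile were used during training; the same backends can be enabled at inference time when loading the checkpoint. A minimal sketch (assumes a supported CUDA GPU and the flash-attn package; drop attn_implementation to fall back to the default backend):

import torch
from transformers import AutoModelForCausalLM

# Load with the same attention backend and compilation listed above.
# flash_attention_2 requires a supported GPU and the flash-attn package.
model = AutoModelForCausalLM.from_pretrained(
    "GODELEV/Test-1-4000",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
).to("cuda")

model = torch.compile(model)  # optional graph compilation, as in training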

Architecture

| Component | Value |
|---|---|
| Hidden Dimension | 768 |
| Layers | 12 |
| Attention Heads | 12 |
| Intermediate Size | 3072 |
| Activation Function | SwiGLU |
| Normalization | RMSNorm |
| Vocabulary Size | 50,257 |

The model uses Rotary Position Embeddings (RoPE), which encode token positions directly in the attention computation and keep long-range token relationships stable across the 2048-token context window.
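
For reference, the table above maps directly onto a Hugging Face LlamaConfig. The sketch below is an assumption about the exact configuration; fields not listed in the table (such as rope_theta) are left at library defaults:

from transformers import LlamaConfig

# Architecture from the table expressed as a LlamaConfig (a sketch;
# unlisted fields such as rope_theta keep their defaults).
config = LlamaConfig(
    vocab_size=50257,              # GPT-2 tokenizer vocabulary
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    max_position_embeddings=2048,  # context window
    hidden_act="silu",             # the SiLU gate inside SwiGLU
)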


Training Progression

Phase 1: Lexical Learning (0 → 250)

The model learned grammar, sentence formation, and common linguistic patterns.

Phase 2: Relational Understanding (250 → 1000)

The model began associating entities, actions, and environments into logically connected sequences.

Phase 3: Narrative Coherence (1000 → 2000)

Narrative continuity emerged. Stories developed stable structure, conflict resolution, and reduced contradiction.

Phase 4: Emergent Narrative Intelligence (2000 → 3000)

The model improved in emotional consistency, long-range memory, and thematic continuity across generations.

Phase 5: Fluent Generative Stability (3000 → 4000)

This final phase marked a transition from structured storytelling into fluent narrative generation.

The model became substantially better at:

  • maintaining tone,
  • producing natural sentence flow,
  • avoiding repetitive degeneration,
  • preserving character consistency,
  • and generating smoother transitions between events.

By this stage, generations began feeling less mechanically predicted and more organically written. Dialogue improved noticeably, pacing became more natural, and narrative structure stabilized across longer outputs.

The reduction in training loss to 0.573 reflects a substantial gain in next-token predictive confidence, consistent with the fluency improvements observed in sampled outputs.


Training Configuration

| Parameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning Rate | 5e-4 |
| Scheduler | OneCycleLR |
| Weight Decay | 0.01 |
| Precision | bfloat16 |
| Effective Batch Size | ~262K tokens/step |
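
A minimal PyTorch sketch of this optimization setup; only the optimizer, learning rate, weight decay, and step count come from this card, and the remaining OneCycleLR shape parameters are assumptions left at their defaults:

import torch

# model is assumed to be an already-constructed nn.Module.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=5e-4,       # peak learning rate from the table
    total_steps=4000,  # final training step from the card
)

# ~262K tokens/step could correspond to, e.g., 128 sequences x the full
# 2048-token context (an assumed decomposition; only the per-step token
# count is stated above).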

Dataset

The model was trained on TinyStories, a synthetic storytelling dataset designed to teach small language models reasoning and narrative structure through a simplified vocabulary and clean writing patterns.

This allows the model to focus on:

  • causal reasoning,
  • narrative flow,
  • emotional continuity,
  • and long-range coherence.
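
For experimentation, TinyStories is publicly hosted on the Hugging Face Hub; the sketch below assumes the roneneldan/TinyStories repository matches the data used here:

from datasets import load_dataset

# Load the public TinyStories dataset (assumed to be the training data).
dataset = load_dataset("roneneldan/TinyStories", split="train")
print(dataset[0]["text"][:200])  # peek at the first story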

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "GODELEV/Test-1-4000"

tokenizer = AutoTokenizer.from_pretrained(model_path)
# The GPT-2 tokenizer has no pad token; reuse EOS so generate() does not
# receive pad_token_id=None.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load in bfloat16, matching the training precision.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Once upon a time, a boy found a silver key."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Nucleus sampling with a mild repetition penalty suits short stories.
output = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
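
For interactive use, generation can also stream tokens to stdout as they are produced, using transformers' TextStreamer. A short sketch reusing the model, tokenizer, and inputs from the example above:

from transformers import TextStreamer

# Print tokens as they are generated instead of waiting for the full output.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    streamer=streamer,
)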

Final Notes

Test-1-4000 demonstrates that coherent and fluent narrative behavior can emerge in compact Transformer models when training is focused on clean, structured data and long-form consistency.

Despite its relatively small size, the model exhibits:

  • strong narrative fluency,
  • stable story progression,
  • coherent emotional structure,
  • and reliable long-context generation.

The project explores how efficient language models can develop increasingly sophisticated generative behavior through progressive training refinement.


Citation

@misc{test14000,
  title={Test-1-4000: A 190M Parameter Narrative Engine},
  author={GODELEV},
  year={2026}
}