decoder-only

Here are 28 public repositories matching this topic...

MKnoche / DONUT

[ICCV 2025] DONUT: A Decoder-Only Model for Trajectory Prediction

autonomous-driving trajectory-prediction motion-prediction argoverse decoder-only iccv2025

Updated Mar 23, 2026
Python

microsoft / encoder-decoder-slm

Efficient encoder-decoder architecture for small language models (≤1B parameters) with cross-architecture knowledge distillation and vision-language capabilities

encoder-decoder vision-and-language llm decoder-only

Updated Feb 7, 2025
Python

liaoyanqing666 / Decoder-only-transformer_Time_Series_Prediction

Time series prediction using a decoder-only Transformer, including SwiGLU and RoPE (Rotary Positional Embedding)

time-series pytorch transformer rope time-series-prediction decoder-only rotary-positional-embedding swiglu

Updated Jan 25, 2024
Python

pittisl / mPnP-LLM

Code for paper "Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI"

deep-learning multimodal embodied-ai large-language-model decoder-only modality-adaptation

Updated Jan 19, 2024
Python

cisnlp / MEXA

[ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

multilingual evaluation embeddings evaluation-metrics cross-lingual multilingual-nlp large-language-models decoder-only

Updated Apr 6, 2025
Python

SanMumumu / SAMPO

SAMPO: Scale-wise Autoregression with Motion Prompt for Generative World Models

generative-model var decoder-only

Updated Apr 5, 2026
Python

Ayush-Aditya / decoder-only-seq2seq

Minimal decoder-only seq2seq pipeline with proper causal masking, teacher forcing, Ignite training loop, and checkpointed inference

nlp machine-learning deep-learning pytorch transformer seq2seq-model pytorch-implementation autoregressive-models decoder-only causal-masking

Updated Feb 23, 2026
Python

ntphuc149 / ViAG

ViAG: A Novel Framework for Fine-tuning Answer Generation models utilizing Encoder-Decoder and Decoder-only Transformer architectures

meteor question-answering bart llama rouge bleu-score encoder-decoder fine-tuning answer-generation t5 plms bartpho llm bertscore instruction-tuning qlora qwen decoder-only vit5

Updated May 26, 2025
Python

pablo-reyes8 / implementing-gpt

Clean-room GPT-2/GPT-3 implementation: tokenizers, architecture blocks, training loop with AdamW + cosine decay, CLI scripts, inference tools, and pytest suite. Covers OpenWebText-10k & WikiText-103 workflows. Designed as an academic reference for understanding and scaling decoder-only transformers

nlp transformers pytorch gpu-acceleration language-model adamw gpt2 gpt3 cosine-decay decoder-only educational-implementation

Updated Feb 18, 2026
Python

msmrexe / pytorch-gpt2-persian-sentiment-generation

A from-scratch implementation of a scaled-down GPT-2 model in PyTorch, trained on the Snappfood dataset for sentiment-controlled Persian text generation.

deep-learning university-project text-generation transformer course-project self-attention review-generation gpt2 positional-embedding decoder-only persian-text-generation causal-attention

Updated Nov 2, 2025
Python

tiagomonteiro0715 / amalia_core

Open source implementation of AMALIA: A Fully Open Large Language Model for European Portuguese

Updated Jul 2, 2026
Python

michaelbabsek / LLM

attention-mechanism multihead-attention llm llm-training llm-inference decoder-only

Updated Jun 2, 2025
Python

MohamedAklamaash / nanocoder-math

Train a decoder-only GPT language model from scratch for code and math reasoning — custom 16k BPE tokenizer, streaming data pipeline, ablation studies, and hardware-aware training on Apple Silicon (MPS) and CUDA.

machine-learning deep-learning tokenizer cuda pytorch transformer code-generation gpt mps language-model from-scratch apple-silicon llm nanogpt decoder-only

Updated Jun 25, 2026
Python

Amir-Hofo / GPT2

Implementation of the GPT-2 architecture using PyTorch, trained on the TinyStories dataset. Features custom training pipelines on Modal (cloud computing) and integration with the Hugging Face ecosystem.

modal transformers pytorch english-nlp gpt2 huggingface decoder-only bpe-tokenizer tiny-stories

Updated Jan 1, 2026
Python

mostafabahaa25 / multi-modal_language_model_pali-gemma

This project is my PyTorch reproduction of PaliGemma, a compact 3B vision–language model that integrates SigLIP vision features with a Gemma decoder. I implemented the full multimodal pipeline, from vision encoding to autoregressive text generation, to study modern VLM architectures from a research perspective.

machine-learning ocr deep-learning vqa attention image-captioning object-detection language-model gemma image-encoder research-implementation siglip decoder-only

Updated Nov 23, 2025
Python

Diatomic-assay511 / pytorch-gpt2-persian-sentiment-generation

deep-learning university-project text-generation transformer course-project self-attention review-generation gpt2 positional-embedding decoder-only persian-text-generation causal-attention

Updated Jul 5, 2026
Python

Banniesdread / decoder-only-seq2seq

Implement a decoder-only Transformer in PyTorch to reverse character sequences using causal masking and cross-entropy loss with Ignite training support