Browse All Blogs | BotMartz

Browse All Articles

The Intelligence Archive.

Deep-dives, tutorials and essays on AI, automation & the future of the web — written by builders, for builders.

Research Explained · LLMs · Agents · Advanced Models · Optimization · Models · TensorFlow · Research · PyTorch · LangChain · Essays · Insights · Deep Dives · Tutorials

17 articles

Research Explained

Research Paper Deep Dive: RoPE (Rotary Position Embeddings) — Better Position Information

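Standard position embeddings are added to the token embeddings and generalize poorly to long ranges. RoPE instead encodes position by rotation: the query and key vectors are multiplied by position-dependent rotation matrices, so attention scores depend on the relative offset between tokens. Combined with frequency-scaling techniques, this underpins models with 100K+ token context windows.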

Research · Position Embeddings · RoPE

Soham Sharma

Jun 12, 2026
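
The rotation described in the RoPE summary above can be sketched in a few lines of plain Python. This is a minimal illustration, not the article's implementation: the function names `rope` and `dot`, the base of 10000, and the toy vectors are all assumptions chosen for the example. It rotates each consecutive pair of dimensions by an angle proportional to the position, then checks that the query-key score depends only on the offset between the two positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d).

    Hypothetical helper for illustration; i runs over even indices, so the
    exponent -i/d equals -2j/d for pair index j.
    """
    d = len(vec)
    assert d % 2 == 0, "dimension must be even"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]  # 2-D rotation of the pair
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy query and key vectors (made up for the example).
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same offset (3) at two very different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 105), rope(k, 102))
print(abs(score_a - score_b) < 1e-9)
```

Because a rotation preserves vector length, the scheme adds no parameters and leaves the norm of each query and key unchanged.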

LLMs

Language Model Architectures: Transformers, Attention, and the Path from GPT-1 to GPT-4

Modern LLMs are Transformers. Understand the evolution: self-attention, positional encoding, scaling laws, and how each architectural change improved performance.

LLMs · Architecture · Transformers

Soham Sharma

Jun 12, 2026

Agents

AI Agent Fundamentals: Decision-Making Loops, Tools, and Agentic vs. Procedural Reasoning

Agents make autonomous decisions by reasoning, planning, and calling tools. Understand the perception-decision-action loop and when to use agents vs. deterministic workflows.
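
The perception-decision-action loop mentioned above can be sketched without any LLM at all. In this toy version, every name (`calculator`, `TOOLS`, `decide`, `run_agent`) is hypothetical, and `decide` is a hard-coded rule standing in for the model call a real agent would make. The point is only the shape of the loop: decide, act through a tool, observe the result, repeat until done or out of budget.

```python
def calculator(expression):
    """Hypothetical tool: evaluates an 'a+b' style integer sum."""
    a, b = expression.split("+")
    return str(int(a) + int(b))

TOOLS = {"calculator": calculator}

def decide(goal, observations):
    """Stand-in for an LLM policy: choose the next action from what is known."""
    if not observations:
        return ("call_tool", "calculator", goal)
    return ("finish", observations[-1])

def run_agent(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):
        action = decide(goal, observations)          # decision
        if action[0] == "finish":
            return action[1]
        _, tool_name, tool_input = action
        result = TOOLS[tool_name](tool_input)        # action
        observations.append(result)                  # perception
    raise RuntimeError("step budget exhausted")

answer = run_agent("2+40")
print(answer)
```

A deterministic workflow would call `calculator` directly. The loop earns its cost only when the next step genuinely depends on what earlier steps returned.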

Mamba: State Space Models and the Alternative to Transformer Attention

Transformer self-attention costs O(n²) in sequence length. Mamba uses selective state space models to process a sequence in O(n) time. Understand selective SSMs and why Mamba is reported to match Transformer quality at roughly 1/5 the memory.
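
To see where the O(n) cost comes from, here is a deliberately simplified state space recurrence. It is an assumption-laden sketch: a scalar state with fixed coefficients `a`, `b`, `c` (names chosen for this example), whereas Mamba's selective SSMs make these parameters depend on the input. What it does show is that each token updates a fixed-size state once, so the work grows linearly with sequence length.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=2.0):
    """Linear recurrence h_t = a*h_{t-1} + b*x_t, with output y_t = c*h_t.

    Non-selective toy version: a, b, c are constants here.
    """
    h = 0.0
    ys = []
    for x in xs:              # one pass over the sequence: O(n)
        h = a * h + b * x     # fixed-size state carries the history
        ys.append(c * h)
    return ys

ys = ssm_scan([1.0, 0.0, 0.0, 4.0])
print(ys)
```

The state `h` is the only memory carried between steps, which is why inference memory does not grow with context length in this family of models.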

Research Paper Deep Dive: Flash Attention 2 — Optimizing Transformer Attention

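Flash Attention speeds up exact attention by roughly 2-4× over a standard implementation by changing memory access patterns: it tiles the computation so the full n×n score matrix is never written out to GPU memory. Understand I/O complexity, tiling, and how to optimize matrix operations on GPUs.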

Research · Flash Attention · Transformers

Soham Sharma

Jun 8, 2026
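
The tiling idea behind Flash Attention rests on an online softmax: keys and values are processed block by block while a running maximum and normalizer are rescaled as needed. The sketch below shows that arithmetic for a single query row in plain Python. It is a CPU illustration under assumed names (`attention_row_naive`, `attention_row_tiled`) and toy data, not the GPU kernel, and it demonstrates only that the blocked result matches the straightforward one.

```python
import math

def attention_row_naive(q, keys, values):
    """Reference: softmax over all scores at once, then a weighted sum."""
    scores = [sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(scores)
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    return [sum(wi * v[j] for wi, v in zip(w, values)) / z
            for j in range(len(values[0]))]

def attention_row_tiled(q, keys, values, block=2):
    """Same result, visiting keys/values one block at a time."""
    m = -math.inf                      # running max of scores seen so far
    z = 0.0                            # running softmax normalizer
    acc = [0.0] * len(values[0])       # running unnormalized output
    for start in range(0, len(keys), block):
        k_blk = keys[start:start + block]
        v_blk = values[start:start + block]
        scores = [sum(a * b for a, b in zip(q, k)) for k in k_blk]
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new)    # rescale old partial sums (0.0 on first block)
        z *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - m_new)
            z += w
            acc = [a + w * vj for a, vj in zip(acc, v)]
        m = m_new
    return [a / z for a in acc]

# Toy inputs, made up for the example.
q = [1.0, 0.5]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0], [2.0, 0.3]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]

naive = attention_row_naive(q, keys, values)
tiled = attention_row_tiled(q, keys, values, block=2)
print(all(abs(a - b) < 1e-9 for a, b in zip(naive, tiled)))
```

Only one block of scores exists at any time, which is the property that lets a GPU kernel keep the working set in fast on-chip memory.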

Optimization

Optimizer Comparison: SGD, Momentum, Adam, RMSprop, and When Each Shines

Different optimizers suit different problems. SGD is stable, Momentum accelerates, Adam is adaptive. Understand why each optimizer works and pick the right one.

Optimization · Training · Gradient Descent

Soham Sharma

Jun 8, 2026
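
The update rules behind three of these optimizers fit in a few lines each. The sketch below applies them to a one-dimensional quadratic, f(w) = (w - 3)², chosen purely for illustration. The function names and hyperparameter defaults are assumptions for this example, and RMSprop is left out for brevity.

```python
import math

def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

def sgd(w, steps, lr=0.1):
    for _ in range(steps):
        w -= lr * grad(w)                  # plain gradient step
    return w

def momentum(w, steps, lr=0.1, beta=0.9):
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)             # accumulate a velocity
        w -= lr * v
    return w

def adam(w, steps, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

for name, fn in [("sgd", sgd), ("momentum", momentum), ("adam", adam)]:
    print(name, fn(0.0, 500))
```

On a problem this simple all three reach the minimum at w = 3. The differences the article discusses show up on ill-conditioned or noisy objectives, which this toy case does not capture.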

Models

Vision Transformers (ViT): Image Classification with Pure Transformers

Vision Transformers apply Transformers to image classification. Patch embeddings convert images to sequences, enabling the same architecture as NLP models.

Computer Vision · Transformers · Image Classification

Soham Sharma

Jun 8, 2026
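
The "images to sequences" step is easy to picture with a tiny example. The sketch below cuts a 4×4 single-channel image into 2×2 patches and flattens each into a vector. The helper name `patchify` and the toy image are assumptions for illustration; a real ViT would then apply a learned linear projection and add position embeddings, both omitted here.

```python
def patchify(image, patch):
    """Split an H x W image (list of rows) into flattened patch vectors.

    Patches are emitted in row-major order: left to right, then top to bottom.
    """
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must divide evenly into patches"
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([image[r][c]
                            for r in range(top, top + patch)
                            for c in range(left, left + patch)])
    return patches

# 4x4 toy image whose pixel values are 0..15 in reading order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)
print(tokens)
```

Each inner list plays the role of one token, so a 4×4 image with 2×2 patches becomes a sequence of length 4.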

TensorFlow

Custom Training Loops with GradientTape: Manual Forward and Backward Passes in TensorFlow

model.fit() hides the training loop. GradientTape exposes it. Use it when you need per-batch gradient manipulation, custom loss combinations, or training dynamics that Keras callbacks can't express.

TensorFlow · GradientTape · Custom Training

Soham Sharma

Jun 3, 2026

Research

Rotary Positional Embeddings (RoPE): How It Works and Why It Beats Learned Embeddings

RoPE encodes position by rotating query and key vectors, treating each pair of dimensions as a complex number. It adds zero parameters, transfers across fine-tuning, and can be extended beyond the training length, typically with frequency scaling. Here's the math.

RoPE · Positional Embeddings · Transformers

Soham Sharma

Jun 3, 2026
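
As a compact preview of that math, the relative-position property can be written out for a single pair of dimensions viewed as one complex number. The symbols here (f for the position-encoding map, m and n for positions, θ for the rotation frequency) are notation assumed for this sketch rather than taken from the article.

```latex
\begin{aligned}
f(\mathbf{x}, m) &= \mathbf{x}\, e^{i m \theta},
  \qquad \mathbf{x} = x_1 + i x_2 \in \mathbb{C}, \\
\langle f(\mathbf{q}, m),\, f(\mathbf{k}, n) \rangle
  &= \operatorname{Re}\!\left[ \mathbf{q}\, e^{i m \theta}\,
     \overline{\mathbf{k}\, e^{i n \theta}} \right] \\
  &= \operatorname{Re}\!\left[ \mathbf{q}\, \overline{\mathbf{k}}\,
     e^{i (m - n) \theta} \right].
\end{aligned}
```

The score depends on the positions only through the difference m - n. In a d-dimensional head, a commonly used frequency schedule assigns pair j the value θ_j = 10000^(-2j/d), so different pairs rotate at different rates.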

PyTorch

PyTorch Custom Dataset and DataLoader: __getitem__, __len__, collate_fn, and num_workers

DataLoader is more than a loop — it's a parallel data pipeline. Build a correct Dataset, write a proper collate_fn, and understand num_workers to eliminate training bottlenecks.

PyTorch · DataLoader · Dataset

Soham Sharma

Jun 3, 2026
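
The Dataset and collate_fn contract can be shown without importing PyTorch. The sketch below is an assumption-heavy stand-in: `ToySequences` and `pad_collate` are hypothetical names, the batch is assembled by hand, and the padded output is a plain list rather than a tensor. It illustrates the two pieces the summary names: a map-style dataset exposing `__len__` and `__getitem__`, and a collate function that turns variable-length samples into one rectangular batch.

```python
class ToySequences:
    """Map-style dataset: anything with __len__ and __getitem__."""

    def __init__(self, sequences):
        self.sequences = sequences

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, index):
        return self.sequences[index]

def pad_collate(batch, pad_value=0):
    """Pad every sequence in the batch to the length of the longest one."""
    longest = max(len(seq) for seq in batch)
    padded = [seq + [pad_value] * (longest - len(seq)) for seq in batch]
    lengths = [len(seq) for seq in batch]   # original lengths, useful for masking
    return padded, lengths

dataset = ToySequences([[1, 2, 3], [4], [5, 6]])

# Hand-built batch of the first two samples, standing in for what a loader does.
batch = [dataset[i] for i in range(2)]
padded, lengths = pad_collate(batch)
print(padded, lengths)
```

With PyTorch installed, the same objects could be handed to `torch.utils.data.DataLoader(dataset, batch_size=2, collate_fn=pad_collate)`. Converting the padded lists to tensors and tuning `num_workers` are left to the article.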

LangChain

Working with LLMs and Chat Models in LangChain: OpenAI, Anthropic, and Local Models via Ollama

LangChain wraps every LLM provider behind the same Runnable interface. Swap OpenAI for Claude or a local Llama model without changing a line of your chain logic.

LangChain · OpenAI · Anthropic

Soham Sharma

Jun 2, 2026

TensorFlow

Keras Sequential vs Functional vs Subclassing: When to Use Which API

Keras gives you three model-building APIs. Sequential is a dead end for anything non-trivial. Functional handles 90% of production architectures. Subclassing gives you full control when you need it.

TensorFlow · Keras · Deep Learning

Soham Sharma

Apr 27, 2026

Stay Ahead

The Intelligence Briefing

Weekly dispatches on AI automation, technical deep-dives, and perspectives from the frontier—delivered straight to your inbox.

No spam, ever. Unsubscribe in one click.
