RECENT POSTS

Embeddings: Getting Started

An embedding is a numerical representation of a word, sentence, image, or audio sample in the form of a high-dimensional vector, designed to capture relevant characteristics of the data, such as its semantic meaning. Embeddings are typically generated by machine learning models trained on large amounts of data. During training, these models learn to organize…
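As a rough illustration of vectors capturing semantic meaning, the sketch below compares toy embeddings with cosine similarity. The 4-dimensional vectors and the `toy_embeddings` name are made up for this example; real embeddings come from a trained model and usually have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written vectors (NOT produced by any real model).
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9, 0.8],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

# Related items should score higher than unrelated ones.
print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

With these toy values, "cat" and "dog" point in nearly the same direction while "cat" and "car" do not, which is the property a trained embedding model aims for.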

Tokenization: BPE Algorithm

NLP models and LLMs do not process raw text directly, but instead operate on numerical representations. In this context, tokenization is the process of converting a sequence of characters (a string) into a sequence of tokens, smaller units of text. These tokens are then mapped to numerical identifiers (integers), which correspond to positions in a…
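The string-to-tokens-to-integers pipeline can be sketched with a single byte-pair-encoding style merge. This is a simplified illustration that assumes a character-level starting vocabulary and performs only one merge; the helper names `most_frequent_pair` and `merge_pair` are hypothetical, and production tokenizers repeat the merge many times over a large corpus.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the adjacent token pair that occurs most often."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every non-overlapping occurrence of `pair` with one merged token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

text = "abababc"
tokens = list(text)                  # start from single characters
pair = most_frequent_pair(tokens)    # the pair to merge in this step
tokens = merge_pair(tokens, pair)    # one merge step

# Map each distinct token to an integer identifier (toy vocabulary).
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}
ids = [vocab[tok] for tok in tokens]

print(tokens)
print(ids)
```

The integer list is what a model would actually consume; the vocabulary built here is only large enough for this one string.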

LangGraph: Human-in-the-Loop & Persistence

Human-in-the-loop (HITL) in LangGraph is a mechanism for pausing graph execution to request human intervention, so that decisions can be validated, approved, or adjusted before the flow continues. Persistence in LangGraph saves the graph state throughout execution as checkpoints written to persistent storage. This enables the workflow to be interrupted and later resumed from…