LLM & Generative AI

Large language models, generative AI, agents, RLHF, multimodal

92 terms

LLM & Generative AI

Agent Evaluation

에이전트 평가

Agent evaluation assesses LLM-based agents that plan, remember, and call tools in external environments by scoring full …

LLM & Generative AI

Agent Loop

에이전트 루프

An agent loop is the control cycle that assembles input and accumulated context, invokes an LLM to plan and choose the n…

LLM & Generative AI

Agentic RAG

에이전틱 RAG

Agentic RAG is a retrieval-augmented generation architecture with an explicit planner (policy) that interleaves multi-st…

ML Fundamentals LLM & Generative AI

Agentic workflows

에이전트 워크플로우

Agentic workflows are dynamic workflows in which multiple specialized AI agents collaborate to plan, reason, use tools, …

LLM & Generative AI

AI Agent

AI 에이전트

An AI agent is a system that pursues a stated goal by perceiving its environment, reasoning and planning, and repeatedly…

Infra & Hardware LLM & Generative AI

AI inference is the runtime phase in which a trained model with fixed weights processes new inputs to produce prediction…

Products & Platforms LLM & Generative AI AI Safety & Ethics

Anthropic is an AI company that provides the Claude family of large language models and a developer platform, distributi…

Deep Learning LLM & Generative AI

Attention is a neural mechanism that computes a weighted aggregation of information by scoring the similarity between a …
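
As a rough illustration of the weighted aggregation this entry describes, the sketch below assumes the common scaled dot-product form: a query is scored against each key, the scores are normalized with softmax, and the resulting weights average the values. The function names and toy vectors are illustrative assumptions, not part of the glossary entry.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Similarity of the query to every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy example: the query matches the first key more closely than the second.
weights, output = attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[10.0, 0.0], [0.0, 10.0]],
)
```

Here the first value vector receives the larger weight, so the output leans toward it.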

LLM & Generative AI

AUC (Area Under the Curve)

곡선 아래 면적

AUC represents the area under the ROC curve and is used as a metric to evaluate the performance of a classification mode…

Infra & Hardware LLM & Generative AI Data Engineering

Batch Inference

Batch inference is an offline prediction method that aggregates a large, fixed set of inputs and generates outputs in bu…

Products & Platforms Infra & Hardware LLM & Generative AI

Amazon Bedrock is a fully managed AWS service that provides secure, enterprise-grade access to multiple foundation model…

CS Fundamentals Data Engineering LLM & Generative AI

BM25 is a probabilistic information-retrieval scoring function that ranks documents by summing per-query-term contributi…
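
The per-query-term sum mentioned in this entry can be sketched as follows. This is a minimal version under stated assumptions: the widely used Okapi form with defaults `k1=1.5` and `b=0.75`, a Lucene-style smoothed IDF, and documents that are already tokenized. The function name `bm25_scores` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # a term absent from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    ["cat", "sat", "mat"],
    ["dog", "sat", "log"],
    ["cat", "cat", "hat"],
]
scores = bm25_scores(["cat"], docs)
```

The document that repeats the query term scores highest, and the one that lacks it scores zero.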

LLM & Generative AI

Browser Agent

브라우저 에이전트

A browser agent is an autonomous system that controls a real web browser to complete tasks by running a closed loop of o…

Products & Platforms LLM & Generative AI

ChatGPT is OpenAI's conversational AI application that turns natural-language user requests into answers or task outputs…

Products & Platforms LLM & Generative AI Deep Learning

Claude is Anthropic’s family of large language models delivered through a developer platform — exposed via the Messages …

LLM & Generative AI

Computer Use

컴퓨터 사용

Computer Use is a tool-and-harness integration pattern where a model perceives the current UI via screenshots, proposes …

LLM & Generative AI

Context Engineering

컨텍스트 엔지니어링

Context engineering is the disciplined selection, organization, and formatting of everything a language model reads on a…

LLM & Generative AI

Context Window

컨텍스트 윈도우

A context window is the finite working memory of a language model that encompasses all tokens it can reference during ge…

LLM & Generative AI ML Fundamentals

CoT, or Chain-of-Thought, is a reasoning technique that prompts or trains large language models to produce or emulate in…

LLM & Generative AI Data Engineering Deep Learning

Cross-Encoder

크로스 인코더

A Cross-Encoder is an interaction-based neural ranker that concatenates the query and document into a single Transformer…

Infra & Hardware LLM & Generative AI

edge deployment

Edge deployment means running AI models or apps close to where data is created — for example on factory lines, inside re…

LLM & Generative AI Deep Learning Data Engineering

An embedding is a learned vector representation that maps discrete objects or high-dimensional inputs into a continuous …

LLM & Generative AI

Evals are the practice of turning benchmark or user‑study measurements into decision‑ready evidence about a model’s capa…

LLM & Generative AI

Evaluation Harness

평가 하니스

An evaluation harness is a testing framework that runs language models or agents against standardized datasets, prompts,…

LLM & Generative AI Deep Learning ML Fundamentals

Fine-tuning is the process of continuing training from a pretrained model to adapt it to a specific task, domain, style,…

Products & Platforms LLM & Generative AI Deep Learning

Gemini is Google’s family of multimodal generative models delivered through the Gemini API and Vertex AI, handling text,…

Products & Platforms LLM & Generative AI

GPT-4o is an OpenAI large language model that can handle text, speech, and images all at once. It’s designed to be…

Deep Learning LLM & Generative AI

grouped-query attention

그룹 쿼리 어텐션

Grouped-query attention is a method used in large language models (LLMs) and transformer-based AI systems to process sev…

AI Safety & Ethics LLM & Generative AI

Hallucination is a failure mode where an LLM produces fluent content that conflicts with source evidence, real-world fac…

Data Engineering LLM & Generative AI

Hybrid Search

하이브리드 검색

Hybrid search is an information-retrieval design that runs lexical (keyword/BM25 over an inverted index) and vector (emb…

LLM & Generative AI

In-Context Learning

문맥 내 학습

In-context learning is a pre-trained language model’s ability to adapt to a task at inference by conditioning on instruc…

LLM & Generative AI Infra & Hardware Deep Learning

Inference is the execution phase where a trained model receives new inputs and computes predictions, classifications, or…

LLM & Generative AI Infra & Hardware Products & Platforms

Inference cost is the operational compute-and-infrastructure expense incurred each time a deployed LLM tokenizes a promp…

Infra & Hardware LLM & Generative AI

inference latency

추론 지연 시간

Inference latency is the actual time it takes for an AI model to process an input and return an output. It typically ref…

LLM & Generative AI

Inference-Time Scaling

추론 시점 스케일링

Inference-Time Scaling is a technique that improves a trained model’s outputs without retraining by allocating more comp…

Infra & Hardware LLM & Generative AI

A KV cache is the inference-time memory structure that stores previously computed attention key/value tensors in an auto…

Infra & Hardware LLM & Generative AI

KV Offloading

KV 오프로딩

KV offloading is an inference technique that tiers the self-attention Key/Value cache from GPU memory to CPU RAM or stor…

LLM & Generative AI Deep Learning ML Fundamentals

Large Language Model

대규모 언어 모델

A large language model is a deep learning system trained on vast text corpora to understand and generate natural languag…

LLM & Generative AI Deep Learning ML Fundamentals

LoRA is a parameter-efficient fine-tuning method that freezes the base model and trains small low-rank adapters instead.
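
A minimal sketch of the forward pass this entry implies, assuming the usual formulation in which a frozen weight matrix `W` is combined with a low-rank update `B A` scaled by `alpha / r`. Only `A` and `B` would receive gradient updates during training; the training loop itself is omitted. All names and numbers below are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha=1.0):
    """Return W x + (alpha / r) * B (A x); only A and B would be trained."""
    r = len(A)                        # adapter rank = number of rows in A
    base = matvec(W, x)               # frozen base-model path
    delta = matvec(B, matvec(A, x))   # low-rank adapter path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

# Toy layer: 2 outputs, 3 inputs, rank-1 adapter.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen weights (d_out x d_in)
A = [[1.0, 1.0, 1.0]]                   # adapter matrix (r x d_in)
B_zero = [[0.0], [0.0]]                 # adapter matrix (d_out x r), zero-initialized
B_trained = [[0.5], [0.0]]              # pretend values after some training
x = [1.0, 2.0, 3.0]

y_start = lora_forward(W, A, B_zero, x)     # identical to the base model
y_tuned = lora_forward(W, A, B_trained, x)  # base output plus adapter delta
```

With `B` initialized to zero the adapted layer starts out identical to the base layer, which is why this initialization is commonly chosen. The adapter here holds r * (d_in + d_out) = 5 trainable values against 6 in the full matrix; the saving grows quickly with layer size.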

LLM & Generative AI

Model Context Protocol

모델 컨텍스트 프로토콜

Model Context Protocol (MCP) is a stateful JSON-RPC 2.0 client-server protocol that lets AI hosts discover and invoke se…

LLM & Generative AI

An MCP server is the service-side component of Model Context Protocol that exposes server capabilities such as tools, re…

Products & Platforms LLM & Generative AI

Mistral AI

미스트랄 AI

Mistral AI is a platform company offering a family of large language models via a first-party API and enterprise product…

LLM & Generative AI

Model Cascading

모델 캐스케이딩

Model cascading is a dynamic routing technique that speculatively runs small, low-cost models first, validates draft res…

Deep Learning LLM & Generative AI

Model Distillation

Model distillation is a training method that teaches a smaller student model to imitate a larger teacher model's output …

Infra & Hardware LLM & Generative AI

Model parallelism

모델 병렬 처리

Model parallelism is a distributed technique that enables training or serving neural networks too large for a single GPU…

LLM & Generative AI Infra & Hardware

Model Router

모델 라우터

A model router is an orchestration layer that selects which model should handle a request based on difficulty, modality,…

Infra & Hardware LLM & Generative AI Products & Platforms

Model serving is the operational system that deploys a trained model behind APIs, batch jobs, or streaming endpoints and…

LLM & Generative AI Deep Learning

Mixture of Experts

전문가 혼합

Mixture of Experts (MoE) is a sparse conditional-computation architecture in which a gating/router network selects a sma…

LLM & Generative AI

multi-agent system

다중 에이전트 시스템

A multi-agent system is a network of multiple artificial intelligence agents that interact within a shared environment, …

LLM & Generative AI Data Engineering

multi-hop retrieval

다중 홉 검색

Multi-hop retrieval is a technique where an AI system answers complex queries by sequentially retrieving and connecting …

ML Fundamentals LLM & Generative AI

multi-stage training

다단계 학습

Multi-stage training is a method for developing AI models—especially large language models (LLMs)—by progressively impro…

LLM & Generative AI Deep Learning

Multimodal Model

멀티모달 모델

A multimodal model is an AI model designed to process or generate across two or more data modalities, such as text, imag…

LLM & Generative AI

Multimodal RAG

멀티모달 RAG

Multimodal RAG is an extension of retrieval-augmented generation that embeds and indexes heterogeneous data such as text…

LLM & Generative AI Deep Learning ML Fundamentals

Natural Language Processing

자연어 처리

Natural Language Processing (NLP) is an AI discipline that enables computers to interpret and produce human language by …

Products & Platforms LLM & Generative AI Infra & Hardware

NVIDIA provides an end-to-end AI software stack—NVIDIA AI Enterprise—spanning deployment microservices (NIM) and develop…

Infra & Hardware LLM & Generative AI

On-Device AI

온디바이스 AI

On-device AI means running artificial intelligence directly on your own device—like a phone, laptop, or tablet—instead o…

LLM & Generative AI

open-source LLM

오픈소스 대형 언어 모델

An open-source large language model (open-source LLM) is a type of AI language model whose underlying code and trained d…

Products & Platforms LLM & Generative AI AI Safety & Ethics

OpenAI is an AI platform and API provider that offers models such as GPT‑5.5 and hosted tools to developers, exposing a …

LLM & Generative AI CS Fundamentals

OpenAI Codex

오픈AI 코덱스

OpenAI Codex is a cloud-based coding agent optimized for software engineering that can implement features, fix bugs, exp…

LLM & Generative AI

Output tokens are pieces of text generated by an AI model in response to input, where the model predicts the next most l…

Infra & Hardware LLM & Generative AI

PagedAttention

페이지드 어텐션

PagedAttention is an LLM-serving memory technique that partitions the attention key–value (KV) cache into fixed-size pag…

ML Fundamentals LLM & Generative AI

Post-training is the stage that adapts a pretrained model to instructions, safety requirements, domain behavior, and hum…

ML Fundamentals LLM & Generative AI

Pre-training is the upstream phase that optimizes a model on large, broad data with self-supervised objectives such as n…

LLM & Generative AI

Prompt Caching

프롬프트 캐싱

Prompt caching is an inference optimization where the provider reuses the model’s prefill state for an exact, sufficient…

AI Safety & Ethics LLM & Generative AI

Prompt Injection

프롬프트 인젝션

Prompt injection is an attack that causes an LLM application to treat untrusted user input or external content as higher…

LLM & Generative AI

PyTorch is an open-source deep learning framework used to build and train neural networks. With its Python-based intuiti…

LLM & Generative AI Data Engineering

Retrieval-Augmented Generation

검색 증강 생성

Retrieval-Augmented Generation (RAG) couples a retriever with a generator so an LLM conditions on top‑K query‑relevant c…

LLM & Generative AI Data Engineering

Re-ranking is a second-stage ranking step in RAG and search pipelines that jointly processes the user query with each in…

Infra & Hardware LLM & Generative AI

real-time inference

실시간 추론

Real-time inference is a serving paradigm that exposes a trained model as an API to execute and respond immediately upon…

LLM & Generative AI

Reasoning Model

A reasoning model is a specialization of large language models that augments next‑token generation with intermediate rea…

Deep Learning LLM & Generative AI

recurrent mechanism

순환 메커니즘

A recurrent mechanism refers to an architectural design in AI models where the output from a previous step is fed back a…

LLM & Generative AI Deep Learning

Reinforcement Learning from Human Feedback

인간 피드백 강화학습

Reinforcement Learning from Human Feedback (RLHF) is a post-training alignment method that treats a language model as a …

CS Fundamentals Deep Learning LLM & Generative AI

RoPE (Rotary Position Embedding)

RoPE(회전 위치 인코딩)

RoPE, or Rotary Position Embedding, is a Transformer positional encoding method that rotates query and key vectors by po…

Data Engineering LLM & Generative AI

Reciprocal Rank Fusion

상호 순위 융합

Reciprocal Rank Fusion (RRF) is a rank-aggregation algorithm that merges result lists from different retrievers by ignor…
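
The merge step can be sketched as below, assuming the standard RRF formula in which each document receives 1 / (k + rank) from every list it appears in, with the conventional constant `k = 60`. Only rank positions are used, never the retrievers' raw scores. The function name and document IDs are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs using rank positions only."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

lexical = ["doc_a", "doc_b", "doc_c"]  # e.g. from a keyword retriever
vector = ["doc_b", "doc_d", "doc_a"]   # e.g. from an embedding retriever
fused = reciprocal_rank_fusion([lexical, vector])
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```

Documents that rank well in both lists rise to the top, even though the two retrievers' scores are on incomparable scales.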

LLM & Generative AI Deep Learning ML Fundamentals

Self-Attention

셀프 어텐션

Self-attention is a mechanism where each element in an input sequence compares itself with all other elements to compute…

Deep Learning LLM & Generative AI

Self-Supervised Pretext Tasks

자기지도 사전학습 과제

Self-supervised pretext tasks are label-free training objectives that exploit intrinsic structure in unlabeled data to l…

LLM & Generative AI Infra & Hardware

Small Language Model

소형 언어 모델

A Small Language Model (SLM) is a language model that performs natural-language understanding or generation with a small…

LLM & Generative AI Infra & Hardware

Speculative Decoding

추측적 디코딩

Speculative decoding is an inference acceleration method where a smaller drafter proposes multiple candidate tokens and …

LLM & Generative AI

Structured Outputs

구조화된 출력

Structured outputs let an LLM generate only data that matches a user‑provided schema (typically JSON Schema), so require…

ML Fundamentals LLM & Generative AI

supervised fine-tuning

지도 미세 조정

Supervised fine-tuning is the process of further training a pre-trained AI model using additional labeled data, where hu…

LLM & Generative AI

SWE-bench is a software engineering benchmark that evaluates language models and agents on real GitHub issues by providi…

Data Engineering LLM & Generative AI ML Fundamentals

Synthetic Data

합성 데이터

Synthetic data is artificially generated data created from rules, simulations, statistical models, or generative models …

LLM & Generative AI

TensorFlow

텐서플로우

TensorFlow is an open-source machine learning and deep learning framework developed by the Google Brain team, designed f…

LLM & Generative AI

Test-Time Compute

테스트 타임 컴퓨트

Test-time compute is the inference-time budget of model evaluations, generated tokens, and wall-clock latency an LLM spe…

LLM & Generative AI CS Fundamentals

A token is the basic unit an LLM uses to represent and process text instead of reading raw characters or full words dire…

LLM & Generative AI

Tool calling is an interaction mechanism where an LLM, given definitions and input schemas of external tools, emits a st…

LLM & Generative AI

Tool use is an interaction pattern where an LLM emits structured calls against declared tool interfaces while the host a…

Deep Learning LLM & Generative AI

Transformer

트랜스포머

A Transformer is a neural network architecture that stacks self-attention and feed-forward blocks to learn relationships…

Data Engineering LLM & Generative AI

Vector Database

벡터 데이터베이스

A vector database is a specialized storage and retrieval system for embeddings that supports low-latency nearest-neighbo…

Deep Learning LLM & Generative AI

vision-language model

비전-언어 모델

A vision-language model is an artificial intelligence model designed to simultaneously understand and process both visua…

Deep Learning LLM & Generative AI

Visual Instruction Tuning

시각 지시 학습

Visual instruction tuning is a supervised fine-tuning approach that aligns a vision encoder with a large language model …

Infra & Hardware LLM & Generative AI

vLLM is an open-source LLM serving engine that boosts throughput by managing the attention KV cache with PagedAttention—…