25 RAG

kchabin 2025. 3. 10. 22:41

※ These notes summarize Bhavishya Pandit's "25 Types of RAG".

https://www.linkedin.com/in/bhavishya-pandit/

Standard RAG

Reference: https://medium.com/@jalajagr/rag-series-part-2-standard-rag-1c5f979b7a92

Retrieval + LLM
Document chunking
Targets a 1-2 second response time for real-time use
Uses external data sources -> better answer quality
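
The chunk → retrieve → prompt flow above can be sketched roughly as follows. This is only a toy illustration: the function names (chunk_text, retrieve, build_prompt), the chunk sizes, and the bag-of-words cosine similarity are all assumptions made for the example. A real system would typically use an embedding model and a vector store, and would pass the prompt to an LLM.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def bow(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = bow(query)
    return sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)[:top_k]


def build_prompt(query: str, contexts: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only the context below.\n{joined}\n\nQuestion: {query}"
```

The output of build_prompt is what would be handed to the LLM; the generation call itself is left out because it depends on the provider.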

Corrective RAG

https://cobusgreyling.medium.com/corrective-rag-crag-5e40467099f8

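Focuses on finding and fixing errors in the generated response
User feedback is used to improve the correction process.
Aims for higher accuracy and user satisfaction than basic RAG

The Retrieval Evaluator is the key component: it judges how relevant the retrieved documents are to the query, which contributes to informative generation.
Decompose-then-Recompose algorithm: breaks the retrieved documents apart to minimize redundant context and non-essential elements → optimization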
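
A minimal sketch of the decompose-then-recompose idea, under loud assumptions: the strips here are simply sentences, and overlap_score is a hypothetical word-overlap stand-in for the Retrieval Evaluator (the paper uses a trained model for that role). The threshold value is also made up for illustration.

```python
import re


def overlap_score(query: str, strip: str) -> float:
    """Hypothetical evaluator: fraction of query words that appear in the strip."""
    q = set(re.findall(r"\w+", query.lower()))
    s = set(re.findall(r"\w+", strip.lower()))
    return len(q & s) / len(q) if q else 0.0


def decompose_recompose(query: str, document: str, threshold: float = 0.3) -> str:
    # Decompose: split the document into sentence-level strips.
    strips = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    kept: list[str] = []
    for strip in strips:
        # Drop exact duplicates and strips the evaluator scores as irrelevant.
        if strip not in kept and overlap_score(query, strip) >= threshold:
            kept.append(strip)
    # Recompose: concatenate the surviving strips in their original order.
    return " ".join(kept)
```

The point is only the shape of the step: split, score each piece against the query, filter, and join what is left.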

https://arxiv.org/pdf/2401.15884

In the Incorrect and Ambiguous cases, the user query X is rewritten into a search query q, and External_knowledge obtained through web search is added. The Ambiguous case uses both Internal and External knowledge.
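
The routing described above might look like the sketch below. The thresholds (upper, lower) and the helper names choose_action and gather_knowledge are assumptions for illustration, and rewrite and web_search are hypothetical callables supplied by the caller. The rule used here (Correct if at least one document scores above the upper threshold, Incorrect if all score below the lower one, otherwise Ambiguous) follows my reading of the CRAG paper.

```python
from typing import Callable


def choose_action(scores: list[float], upper: float = 0.6, lower: float = 0.2) -> str:
    """Map evaluator scores to a CRAG-style action (threshold values are assumed)."""
    if any(s > upper for s in scores):
        return "correct"
    if all(s < lower for s in scores):
        return "incorrect"
    return "ambiguous"


def gather_knowledge(
    query: str,
    docs: list[str],
    scores: list[float],
    rewrite: Callable[[str], str],
    web_search: Callable[[str], list[str]],
) -> tuple[str, list[str]]:
    action = choose_action(scores)
    knowledge: list[str] = []
    if action in ("correct", "ambiguous"):
        knowledge += docs  # internal knowledge (retrieved documents)
    if action in ("incorrect", "ambiguous"):
        # Rewrite the user query X into a search query q, then search the web.
        knowledge += web_search(rewrite(query))  # external knowledge
    return action, knowledge
```

For example, with scores [0.4, 0.1] the action is "ambiguous", so both the retrieved documents and the web results are returned.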

Cons

It relies on an external evaluator and focuses only on improving the quality of the retrieved documents; it does not improve the model's own reasoning ability.

Speculative RAG

https://arxiv.org/html/2407.08223v1?utm_source=turingpost.co.kr&utm_medium=referral&utm_campaign=topic-9-speculative-rag

https://cobusgreyling.medium.com/speculative-rag-by-google-research-444f7b7ef296

Corrective RAG (Yan et al., 2024) on the other hand proposes a lightweight retrieval evaluator, but it lacks the capability for high-level reasoning. In contrast, our proposed Speculative RAG addresses these limitations by leveraging a smaller RAG drafter model to efficiently understand diverse perspectives in retrieval results and generate drafts for the generalist LMs to verify and integrate.

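→ The drafting work is handed to a small "specialist" model, which reduces the computational load on the large "generalist" model and lets it focus only on verification.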

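Knowledge-intensive tasks
RAG Drafter: small, specialist model
- An SLM is used
- Document retrieval → create subsets → process in parallel → fewer input tokens per draft
RAG Verifier (Generalist LM): evaluates and verifies the drafts, then selects the best answer
- The generalist model only verifies and integrates; it needs no additional tuning
Mitigates the lost-in-the-middle phenomenon
Lower latency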
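
The drafter/verifier split above can be sketched as below. This is a simplification under stated assumptions: drafter and verifier are hypothetical callables standing in for the small specialist model and the generalist LM, and make_subsets uses plain round-robin partitioning rather than the clustering-based sampling described in the paper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def make_subsets(docs: list[str], n_subsets: int) -> list[list[str]]:
    """Round-robin partition of the retrieved docs (a simplifying assumption)."""
    subsets = [docs[i::n_subsets] for i in range(n_subsets)]
    return [s for s in subsets if s]


def speculative_rag(
    query: str,
    docs: list[str],
    drafter: Callable[[str, list[str]], str],
    verifier: Callable[[str, str], float],
    n_subsets: int = 3,
) -> str:
    subsets = make_subsets(docs, n_subsets)
    # Each draft only sees its own subset, so input tokens per draft stay small.
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda s: drafter(query, s), subsets))
    # The generalist model only scores the drafts and the best one is returned.
    return max(drafts, key=lambda d: verifier(query, d))
```

Because each draft is produced independently, the drafting calls can run in parallel, which is where the latency reduction comes from.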

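Cons

Implementation complexity
Training overhead
- The RAG Drafter needs additional training
- Requires resources and time
- Hard to deploy quickly
Dependence on retrieval quality → poor retrieval results produce suboptimal drafts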

Applications

Knowledge-intensive question answering
Real-time information retrieval: accuracy, latency
Medical and legal text analysis

To implement later: https://www.datacamp.com/tutorial/speculative-rag

Fusion RAG

RAG + Reciprocal Rank Fusion

New queries are generated from the original incoming query.
For example, if the original query is "Tell me about MEMs microphones", several related queries are generated from it.

Reciprocal rank fusion

https://arxiv.org/pdf/2402.03367

An algorithm that assigns a score to every document and ranks the documents by that score
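
As a sketch of the scoring: in the usual formulation of reciprocal rank fusion, each document receives the sum of 1 / (k + rank) over every result list it appears in, where rank is its 1-based position in that list. The constant k = 60 below is the conventional default from the original RRF work, not something stated in this post, and the document IDs are made up for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One ranked list per generated query (hypothetical document IDs).
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a", "d"], ["b", "c", "a"]])
```

Documents that rank well across many of the generated queries rise to the top, even if they are not first for any single query.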

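Can give more accurate and more comprehensive answers than conventional RAG

But: slower response time, answers that can drift away from the original query, and a need for careful prompt engineering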

Self RAG

SELF-RAG is a framework that enhances the quality and factuality of an LLM through retrieval and self-reflection

When a Retrieve token is generated, passages related to the question are retrieved.

Without the token, the model generates the answer directly, with no retrieval.

Reflection Tokens

Retrieve Token
Critique Token

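Reflection tokens are used for finer control over the generation process.

[Retrieve]: decides whether to retrieve information with the retriever R
[IsREL]: relevance check that judges whether the given passage d contains information needed to solve the problem x
[IsSUP]: verification of whether the statements in the response y are supported by the passage d
[IsUSE]: evaluates how useful the response y is for the problem x (1~5, higher means more useful)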

Retrieval needed → retrieve

Retrieval not needed → predict the next output segment $y_t$

Predict critique tokens → judge the relevance of the retrieved passages

Then critique the response itself
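
The flow above, sketched as one generation step. Everything here is a stand-in: in SELF-RAG a single trained LM emits the reflection tokens itself, whereas this sketch splits that into hypothetical callables (needs_retrieval, retrieve, generate, critique). The WEIGHTS values and the way IsUSE is normalized are assumptions for illustration.

```python
from typing import Callable, Optional

# Assumed weights for combining critique scores; not taken from the paper.
WEIGHTS = {"is_rel": 1.0, "is_sup": 1.0, "is_use": 0.5}


def segment_score(critique: dict) -> float:
    """Weighted sum of critique scores; IsUSE (1~5) is rescaled to 0~1."""
    use = (critique["is_use"] - 1) / 4
    return (
        WEIGHTS["is_rel"] * critique["is_rel"]
        + WEIGHTS["is_sup"] * critique["is_sup"]
        + WEIGHTS["is_use"] * use
    )


def self_rag_step(
    x: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str, str], dict],
) -> str:
    # [Retrieve] says no: generate the next segment directly.
    if not needs_retrieval(x):
        return generate(x, None)
    best_y, best_score = None, float("-inf")
    for d in retrieve(x):
        y = generate(x, d)  # one candidate segment per passage
        score = segment_score(critique(x, d, y))  # IsREL, IsSUP, IsUSE
        if score > best_score:
            best_y, best_score = y, score
    # Fall back to direct generation if nothing was retrieved.
    return best_y if best_y is not None else generate(x, None)
```

The candidate whose passage is relevant, whose statements are supported, and which is most useful ends up with the highest combined score and is kept as the output segment.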

License: Attribution, NonCommercial, NoDerivatives
