SuperSCC.rag.SimpleRAG

class SuperSCC.rag.SimpleRAG(file_path: str, file_type: str)[source]

A class that encapsulates a complete Retrieval-Augmented Generation (RAG) pipeline, from data loading and processing to answer generation and citation.

Parameters:
  • file_path (str) – The path to the file or the root directory containing the documents to be processed.

  • file_type (str) – The file extension to look for (e.g., “pdf”, “csv”). This determines which files are loaded.
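
The way the two parameters interact can be illustrated with a small sketch. The helper find_documents below is hypothetical and is not part of SuperSCC. It assumes that a file path is accepted only when its extension matches, that a directory is searched recursively, and that matching is case-insensitive. The actual loader may behave differently.

```python
from pathlib import Path


def find_documents(file_path: str, file_type: str) -> list[Path]:
    """Hypothetical helper (not SuperSCC API): collect files matching an extension.

    file_path may point at a single file or at a root directory.
    file_type is an extension such as "pdf" or "csv"; a leading dot is tolerated.
    """
    root = Path(file_path)
    suffix = "." + file_type.lower().lstrip(".")

    # A single file is returned only if its extension matches.
    if root.is_file():
        return [root] if root.suffix.lower() == suffix else []

    # A directory is walked recursively; non-matching files are skipped.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == suffix
    )
```

Under these assumptions, find_documents("papers", "pdf") would return every PDF below the papers directory, while passing the path of a single CSV file together with "pdf" would return an empty list.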

__init__(file_path: str, file_type: str)[source]

Methods

__init__(file_path, file_type)

add_documents(file_path, file_type[, ...])

change_text_embedding(model_name[, ...])

create_rag_chain(vector_store, model, ...[, ...])

data_loader(file_path[, mode, metadata_columns])

format_docs(docs)

get_all_ids()

get_answer(gene_list[, query, ...])

The main entry point for asking a question.

get_relevant_segments()

highlight_docs()

hybrid_search([hierarchy_search, key, value])

recursive_search(path[, type])

refine_query()

rerank([model, top_n])

run_rag(qdrant_location, ...[, qdrant_host, ...])

Executes the entire RAG pipeline from scratch: loading, splitting, encoding, and creating the chain.

score_documents([docs])

summary_res(res)

text_encode(text, model_name, location[, ...])

text_split(docs[, chunk_size, ...])

translator([query])

update_rag_chain([model, api_key, base_url, ...])

Updates components of the existing RAG chain, such as the LLM or prompt.
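
The stages named in the run_rag summary (loading, splitting, encoding, and creating the chain) can be pictured with a toy, standard-library-only sketch. None of the functions below belong to SuperSCC: split_text, encode, cosine, retrieve and build_prompt are illustrative names, word counts stand in for a real text-embedding model, an in-memory list stands in for a vector store, and the final prompt string stands in for the call to an LLM.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 8) -> list[str]:
    """Split text into chunks of at most chunk_size whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def encode(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = encode(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, encode(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Number the retrieved chunks so an answer could cite them as [1], [2], ..."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, start=1))
    return f"Answer using only the numbered context.\n{numbered}\nQuestion: {query}"


# "Loading" is replaced by an inline string to keep the sketch self-contained.
document = (
    "CD3E is a marker of T cells. MS4A1 is a marker of B cells. "
    "LYZ is highly expressed in monocytes."
)
chunks = split_text(document, chunk_size=7)
query = "Which cells express MS4A1?"
top = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, top)
print(prompt)
```

The numbered context lines are what would let a generated answer point back to its sources, which is the role the class description assigns to citation. A real pipeline would replace encode and retrieve with an embedding model and a vector-store query.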