Resumes are full of rich, structured information—but extracting this data automatically can be a challenge. In this blog, we’ll walk through how to build a Resume Q&A system using:
- LlamaIndex
- OpenAI’s Embeddings + GPT models
- LlamaParse for intelligent document parsing
- Workflow-based orchestration with llama_index.core.workflow
Let’s dive into how you can transform a static resume into an interactive queryable system.
Overview
We’ll build a pipeline that:
- Parses a resume PDF and extracts content as markdown.
- Embeds the parsed content into a vector index using OpenAI embeddings.
- Allows querying via GPT-4o-mini.
- Enables reusable workflows and tools for consistent resume analytics.
Prerequisites
Before you begin, make sure you have:
- llama-index
- llama-parse
- openai
- nest_asyncio (for Jupyter support)
- OpenAI and LlamaParse API keys
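If you're working in a notebook, a small setup cell like the following takes care of async support and API keys; it assumes the keys live in the OPENAI_API_KEY and LLAMA_CLOUD_API_KEY environment variables (the variable names are illustrative):

import os
import nest_asyncio

# Allow nested event loops inside Jupyter (LlamaParse and workflows use asyncio)
nest_asyncio.apply()

# Read the API keys from the environment
openai_api_key = os.environ["OPENAI_API_KEY"]
llama_cloud_api_key = os.environ["LLAMA_CLOUD_API_KEY"]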
Step 1: Document Ingestion with LlamaParse
We use LlamaParse to extract structured content (in markdown) from resumes:
from llama_parse import LlamaParse

documents = LlamaParse(
    api_key=llama_cloud_api_key,
    result_type="markdown",
    system_prompt_append="Extract the content from the document and return it in markdown format."
).load_data("data/Fake_Resume.pdf")
This outputs clean, semantically rich text, ready for downstream indexing.
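To sanity-check the parse before indexing, you can print the start of the first parsed document (a quick illustrative check, not part of the pipeline):

# Peek at the parsed markdown to confirm the resume was extracted cleanly
print(documents[0].text[:500])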
Step 2: Build a Vector Index
Next, we embed the parsed content using OpenAI’s text-embedding-3-small and build a vector index:
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding

index = VectorStoreIndex.from_documents(
    documents,
    embed_model=OpenAIEmbedding(
        api_key=openai_api_key,
        model="text-embedding-3-small"
    )
)
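As a quick check that the index retrieves sensible chunks, you can probe the underlying retriever directly (an illustrative test; the query string is arbitrary):

# Retrieve the top-2 chunks most similar to a test query
retriever = index.as_retriever(similarity_top_k=2)
for node in retriever.retrieve("work experience"):
    print(node.score, node.text[:80])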
Step 3: Query Using GPT-4o-mini
We create a query engine that uses gpt-4o-mini as the backend LLM:
from llama_index.llms.openai import OpenAI

query_engine = index.as_query_engine(
    llm=OpenAI(model="gpt-4o-mini"),
    similarity_top_k=3
)
Now you can ask natural language questions like:
response = query_engine.query("What is the name of the person and their current job title?")
print(response)
Output:
The name of the person in the resume is Homer Simpson, and their current job title is Night Auditor.
Step 4: Persist and Reload the Index
To avoid rebuilding the index every time, persist it to disk:
index.storage_context.persist(persist_dir="./storage")
Later, you can reload it using:
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir="./storage")
restored_index = load_index_from_storage(storage_context)
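In practice you can fold both paths into a single load-or-build step; the workflow in Step 6 uses exactly this pattern. A minimal sketch (get_or_build_index is a hypothetical helper; it reuses the documents parsed in Step 1):

import os

def get_or_build_index(persist_dir: str = "./storage"):
    # Reload a previously persisted index, or build and persist a new one
    if os.path.exists(persist_dir):
        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context)
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persist_dir)
    return index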
Step 5: Turn Q&A Into a Tool
You can create reusable tools with FunctionTool and invoke them via a FunctionCallingAgent:
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent

def query_resume(query: str) -> str:
    """Answer a question about the resume content."""
    return str(query_engine.query(query))

resume_tool = FunctionTool.from_defaults(fn=query_resume)
agent = FunctionCallingAgent.from_tools(
    tools=[resume_tool],
    llm=OpenAI(model="gpt-4o-mini"),
    verbose=True,
)
Chat with your resume:
response = agent.chat("How many years of experience does the applicant have?")
print(response)
Output:
> Running step 7f4678c6-35e6-406a-a468-a202c5ae760e. Step input: How many years of experience does the applicant have?
Added user message to memory: How many years of experience does the applicant have?
=== Calling Function ===
Calling function: query_resume with args: {"query": "years of experience"}
=== Function Output ===
The name of the person in the resume is Homer Simpson, and their current job title is Night Auditor.
> Running step 21ac778c-2bb1-404a-b70b-8773e97cfbef. Step input: None
=== Calling Function ===
Calling function: query_resume with args: {"query": "Homer Simpson's years of experience"}
=== Function Output ===
The name of the person in the resume is Homer Simpson, and their current job title is Night Auditor.
> Running step 325fcb29-9753-47c1-b62d-9a18fcb17d18. Step input: None
=== LLM Response ===
It seems that I wasn't able to retrieve the specific years of experience for the applicant, Homer Simpson. If you have more details or specific sections of the resume you'd like me to check, please let me know!
It seems that I wasn't able to retrieve the specific years of experience for the applicant, Homer Simpson. If you have more details or specific sections of the resume you'd like me to check, please let me know!
Surprisingly, the fake resume used here does not mention any years of experience, so the tool cannot answer the question. Note also that the final response is printed twice: once by the agent because we set verbose=True, and once by our own print(response) call.
Step 6: Create a Workflow
LlamaIndex provides a declarative workflow API. Here’s a two-step RAGWorkflow that reuses the code we experimented with above: parsing, indexing, persistence, and querying are combined into a single reusable pipeline.
Flow Diagram:
StartEvent(resume_file, query)
            |
            v
 +----------------------+
 |    setup_workflow    |
 |----------------------|
 | Check file exists    |
 | Load or build index  |
 | Persist if new       |
 | Create query engine  |
 +----------------------+
            |
            v
    QueryEvent(query)
            |
            v
 +----------------------+
 |     ask_question     |
 |----------------------|
 | Query the engine     |
 +----------------------+
            |
            v
    StopEvent(result)
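Workflow steps communicate through events. QueryEvent is a custom event (it is not built into LlamaIndex), so we define it first; a minimal definition carries just the query string:

from llama_index.core.workflow import Event

class QueryEvent(Event):
    # Carries the user's question from setup_workflow to ask_question
    query: str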
import os
from llama_index.core.workflow import Workflow, step, Context, StartEvent, StopEvent

class RAGWorkflow(Workflow):
    storage_dir = "./storage"
    llm: OpenAI
    query_engine: VectorStoreIndex

    @step
    async def setup_workflow(self, ctx: Context, ev: StartEvent) -> QueryEvent:
        # Check that a resume file is provided
        if not ev.resume_file:
            raise ValueError("Resume file is required to setup the workflow.")

        # Initialize the LLM
        self.llm = OpenAI(model="gpt-4o-mini")

        # Load the index from persistent storage (if available)
        if os.path.exists(self.storage_dir):
            storage_context = StorageContext.from_defaults(persist_dir=self.storage_dir)
            index = load_index_from_storage(storage_context)
        else:
            # Parse and load the documents
            documents = LlamaParse(
                api_key=llama_cloud_api_key,
                result_type="markdown",
                system_prompt_append="Extract the resume content from the document and return it in markdown format."
            ).load_data(ev.resume_file)
            # Embed and index the documents
            index = VectorStoreIndex.from_documents(
                documents,
                embed_model=OpenAIEmbedding(model="text-embedding-3-small")
            )
            index.storage_context.persist(persist_dir=self.storage_dir)

        # Create the query engine
        self.query_engine = index.as_query_engine(
            llm=self.llm,
            similarity_top_k=3
        )

        # Emit the QueryEvent to be consumed by ask_question
        return QueryEvent(query=ev.query)

    @step
    async def ask_question(self, ctx: Context, ev: QueryEvent) -> StopEvent:
        response = self.query_engine.query(ev.query)
        return StopEvent(result=str(response))
Run the workflow:
w = RAGWorkflow(timeout=60, verbose=False)
result = await w.run(
    resume_file="data/Fake_Resume.pdf",
    query="What is the name of the person and what is their current job title?"
)
print(result)
Output:
Loading llama_index.core.storage.kvstore.simple_kvstore from ./storage/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./storage/index_store.json.
The name of the person in the resume is Homer Simpson, and their current job title is Night Auditor.
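The bare await above works in a Jupyter notebook (with nest_asyncio applied). If you run the pipeline as a plain Python script instead, wrap the call in asyncio.run:

import asyncio

async def main():
    w = RAGWorkflow(timeout=60, verbose=False)
    result = await w.run(
        resume_file="data/Fake_Resume.pdf",
        query="What is the name of the person and what is their current job title?"
    )
    print(result)

asyncio.run(main())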
Visualize the Workflow
Generate a visualization of the workflow logic:
from llama_index.utils.workflow import draw_all_possible_flows

draw_all_possible_flows(w, filename="workflows/rag.html")
This is useful for documentation or debugging multi-step workflows.
Conclusion
With just a few lines of code and the power of LlamaIndex, OpenAI, and LlamaParse, we’ve built an intelligent resume analysis system. This setup can be extended to handle:
- Multiple resumes (see the sketch after this list)
- ATS (Applicant Tracking System) integrations
- Advanced analytics and scoring models
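For instance, several resumes can be parsed and indexed together so one query engine searches across all of them (a minimal sketch; the file paths are illustrative):

# Parse several resumes and build one shared index over all of them
resume_files = ["data/Fake_Resume.pdf", "data/Another_Resume.pdf"]
parser = LlamaParse(api_key=llama_cloud_api_key, result_type="markdown")

all_documents = []
for path in resume_files:
    all_documents.extend(parser.load_data(path))

multi_index = VectorStoreIndex.from_documents(all_documents)
multi_engine = multi_index.as_query_engine(llm=OpenAI(model="gpt-4o-mini"), similarity_top_k=5)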
GitHub & Resources
- https://github.com/juggarnautss/Event_Driven_Agent_Doc_Workflow/blob/main/rag.ipynb
- LlamaIndex Docs
- LlamaParse
- OpenAI Embeddings