{"id":144,"date":"2025-08-05T07:56:59","date_gmt":"2025-08-05T07:56:59","guid":{"rendered":"https:\/\/aiinfrahub.com\/about-us\/?p=144"},"modified":"2025-08-05T19:09:54","modified_gmt":"2025-08-05T19:09:54","slug":"building-an-mcp-server-using-fastmcp-and-arxiv","status":"publish","type":"post","link":"https:\/\/aiinfrahub.com\/about-us\/building-an-mcp-server-using-fastmcp-and-arxiv\/","title":{"rendered":"Building an MCP Server Using FastMCP and arXiv"},"content":{"rendered":"\n<p class=\"has-medium-font-size\">In this blog post, we explore how to build a modular, agent-compatible <strong>MCP (Model Context Protocol)<\/strong> server that automates the discovery, storage, and retrieval of research papers from arXiv. By leveraging the FastMCP framework, we expose tool-like interfaces that can be invoked by agents, UIs, or even chat interfaces for smarter academic workflows.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What is an MCP Server?<\/h3>\n\n\n\n<p>An <strong>MCP (Model Context Protocol) Server<\/strong> is a lightweight, modular service that exposes tools or functions through a standard interface so they can be used by AI agents, workflows, or external systems. When built with the FastMCP framework, it lets developers quickly register Python functions as callable tools over stdio, Streamable HTTP, or other transports, enabling LLM-driven automation, tool use, and orchestration.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What is FastMCP?<\/h3>\n\n\n\n<p>FastMCP is a lightweight Python framework that lets you wrap functions as <strong>tools<\/strong> inside an <strong>MCP-compatible agent server<\/strong>. 
Think of it as an easier way to define callable capabilities (e.g., &#8220;search papers&#8221;, &#8220;get status&#8221;) that can be used by AI agents, other tools, or event-driven workflows.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Project Goals<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Create a tool that searches arXiv for research papers on a given topic<\/li>\n\n\n\n<li>Store paper metadata (title, authors, summary, etc.) locally per topic<\/li>\n\n\n\n<li>Create another tool that looks up a stored paper&#8217;s information by its <strong>arXiv ID<\/strong><\/li>\n\n\n\n<li>Expose this functionality using FastMCP so it can be called by LLM agents or CLI tools<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">GitHub<\/h3>\n\n\n\n<p><a href=\"https:\/\/github.com\/juggarnautss\/mcp_server_fastmcp_arxiv\">https:\/\/github.com\/juggarnautss\/mcp_server_fastmcp_arxiv<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Dependencies<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install arxiv mcp<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Directory Structure<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mcp_server\/\n\u251c\u2500\u2500 research_server.py   # Main server file\n\u2514\u2500\u2500 research_papers\/     # Stores metadata per topic<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Tool 1: Search and Store Papers (<code>search_arxiv<\/code>)<\/h3>\n\n\n\n<p>This tool performs a search on arXiv for a given topic, processes the results, and stores them in a local JSON file under a directory named after the topic.<\/p>\n\n\n\n<pre class=\"wp-block-code has-very-light-gray-to-cyan-bluish-gray-gradient-background has-background has-small-font-size\"><code># Setup assumed by the tools below (not shown in the original snippet).\n# The import path assumes the official `mcp` package installed above; the\n# server name \"research\" and the directory value are illustrative.\nimport json\nimport os\nfrom typing import List\n\nimport arxiv\nfrom mcp.server.fastmcp import FastMCP\n\nRESEARCH_PAPER_DIR = \"research_papers\"\n\nmcp = FastMCP(\"research\")\n\n\n@mcp.tool()\ndef search_arxiv(topic: str, max_results: int = 5) -> List&#91;str]:\n    \"\"\"\n    Search for research papers on arXiv based on a topic and store the information.\n\n    Args:\n        topic (str): topic to search 
for in arXiv.\n        max_results (int): Maximum number of results to return.\n\n    Returns:\n        List&#91;str]: The short arXiv IDs of the papers that were found and stored.\n    \"\"\"\n    client = arxiv.Client()\n\n    search_papers = arxiv.Search(\n        query=topic,\n        max_results=max_results,\n        sort_by=arxiv.SortCriterion.Relevance\n    )\n\n    research_papers = client.results(search_papers)\n\n    # Create a directory for the topic\n    path = os.path.join(RESEARCH_PAPER_DIR, topic.lower().replace(\" \", \"_\"))\n    os.makedirs(path, exist_ok=True)\n\n    # Get the file path\n    file_path = os.path.join(path, \"research_papers_info.json\")\n\n    # Load previously stored papers for this topic, if any\n    try:\n        with open(file_path, \"r\") as json_file:\n            papers_info = json.load(json_file)\n    except (FileNotFoundError, json.JSONDecodeError):\n        papers_info = {}\n\n    # Process each paper and add its information to the dictionary\n    papers_ids = &#91;]\n    for paper in research_papers:\n        paper_id = paper.get_short_id()\n        papers_ids.append(paper_id)\n        paper_info = {\n            \"title\": paper.title,\n            \"summary\": paper.summary,\n            \"authors\": &#91;author.name for author in paper.authors],\n            \"published\": paper.published.isoformat(),\n            \"pdf_url\": paper.pdf_url\n        }\n        papers_info&#91;paper_id] = paper_info\n\n    # Save the updated information back to the file\n    with open(file_path, \"w\") as json_file:\n        json.dump(papers_info, json_file, indent=2)\n\n    # Note: with the stdio transport, stdout carries protocol messages,\n    # so logging to stderr is safer than print() here.\n    print(f\"Research papers information saved to {file_path}\")\n\n    # Return the list of paper IDs\n    return papers_ids\n<\/code><\/pre>\n\n\n\n<p>The <strong>@mcp.tool()<\/strong> decorator is part of the <strong>FastMCP<\/strong> framework. 
It registers a Python function as a callable <strong>&#8220;tool&#8221;<\/strong> that the MCP server exposes to external clients or agents (such as an LLM).<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Tool 2: Retrieve Paper Info by ID (<code>get_paper_info<\/code>)<\/h3>\n\n\n\n<p>This tool searches across all locally stored topics and retrieves the metadata of a paper by its arXiv ID (e.g., <code>2507.14912v1<\/code>).<\/p>\n\n\n\n<pre class=\"wp-block-code has-very-light-gray-to-cyan-bluish-gray-gradient-background has-background has-small-font-size\"><code>@mcp.tool()\ndef get_paper_info(paper_id: str) -> str:\n    \"\"\"\n    Get information about a specific research paper by its ID across all topic directories.\n\n    Args:\n        paper_id (str): The ID of the research paper to retrieve information for.\n\n    Returns:\n        str: The paper information as a JSON-formatted string, or a not-found message.\n    \"\"\"\n\n    # Each subdirectory of RESEARCH_PAPER_DIR holds the papers for one topic\n    for entry in os.listdir(RESEARCH_PAPER_DIR):\n        entry_path = os.path.join(RESEARCH_PAPER_DIR, entry)\n        if os.path.isdir(entry_path):\n            file_path = os.path.join(entry_path, \"research_papers_info.json\")\n            if os.path.isfile(file_path):\n                try:\n                    with open(file_path, \"r\") as json_file:\n                        papers_info = json.load(json_file)\n                        if paper_id in papers_info:\n                            return json.dumps(papers_info&#91;paper_id], indent=4)\n                except (FileNotFoundError, json.JSONDecodeError):\n                    print(f\"Error reading {file_path}\")\n                    continue\n\n    return f\"Paper with ID {paper_id} not found in any topic directory.\"\n<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Run the MCP Server<\/h3>\n\n\n\n<pre class=\"wp-block-code has-very-light-gray-to-cyan-bluish-gray-gradient-background 
has-background\"><code>if __name__ == \"__main__\":\n    mcp.run(transport=\"stdio\")<\/code><\/pre>\n\n\n\n<p>We use &#8220;stdio&#8221; (standard input\/output) as the transport because we are running the MCP server locally. The client (MCP Inspector) communicates with the local MCP server over standard input and output.<\/p>\n\n\n\n<p>If you want to run the MCP server remotely, use the &#8220;Streamable HTTP&#8221; transport instead.<\/p>\n\n\n\n<p>For more details, see <a href=\"https:\/\/modelcontextprotocol.io\/specification\/2025-06-18\/basic\/transports\">https:\/\/modelcontextprotocol.io\/specification\/2025-06-18\/basic\/transports<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Setting up your Environment to test the Server<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Clone the GitHub repository<\/li>\n\n\n\n<li><code>uv init<\/code><\/li>\n\n\n\n<li><code>uv venv<\/code><\/li>\n\n\n\n<li><code>source .venv\/bin\/activate<\/code><\/li>\n\n\n\n<li><code>uv pip install -r requirments.txt<\/code><\/li>\n\n\n\n<li>Launch the MCP Inspector (a sandbox environment for testing the MCP server without needing an MCP client)\n<ul class=\"wp-block-list\">\n<li><code>npx @modelcontextprotocol\/inspector uv run research_server.py<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">MCP server testing using MCP Inspector<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">Inspector landing page<\/h4>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"714\" src=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-8-1024x714.png\" alt=\"\" class=\"wp-image-149\" srcset=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-8-1024x714.png 1024w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-8-300x209.png 300w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-8-768x536.png 768w, 
https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-8.png 1263w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">List the tools<\/h4>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"701\" src=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-9-1024x701.png\" alt=\"\" class=\"wp-image-150\" srcset=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-9-1024x701.png 1024w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-9-300x205.png 300w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-9-768x526.png 768w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-9.png 1269w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Run the tool &#8211; search_arxiv<\/h4>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"734\" src=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-10-1024x734.png\" alt=\"\" class=\"wp-image-151\" srcset=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-10-1024x734.png 1024w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-10-300x215.png 300w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-10-768x550.png 768w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-10.png 1263w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Run the tool &#8211; get_paper_info<\/h4>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"737\" src=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-7-1024x737.png\" alt=\"\" class=\"wp-image-148\" 
srcset=\"https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-7-1024x737.png 1024w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-7-300x216.png 300w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-7-768x553.png 768w, https:\/\/aiinfrahub.com\/wp-content\/uploads\/2025\/08\/image-7.png 1267w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>By combining the power of the <strong>arXiv API<\/strong>, <strong>FastMCP<\/strong>, and the modular structure of an <strong>MCP Server<\/strong>, we can quickly build a lightweight yet extensible system to automate research discovery and paper management. With just a few lines of code, our tools can now be invoked by AI agents, UIs, or scripts, enabling smarter workflows for academics, developers, and research teams.<\/p>\n\n\n\n<p>This is just the tip of the iceberg. The MCP architecture opens the door to building tool-rich, agent-driven systems where each function is reusable, inspectable, and callable, much like a Swiss Army knife for AI-powered automation.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<p><a href=\"https:\/\/github.com\/modelcontextprotocol\/inspector\">MCP Inspector<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/jlowin\/fastmcp\">FastMCP<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/modelcontextprotocol\/servers\">MCP Servers<\/a><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this blog post, we explore how to build a modular, agent-compatible MCP (Model Context Protocol) server that automates the discovery, storage, and retrieval of research papers from arXiv. By leveraging the FastMCP framework, we expose tool-like interfaces that can be invoked by agents, UIs, or even chat interfaces for smarter academic workflows. 
What is &#8230; <a title=\"Building an MCP Server Using FastMCP and arXiv\" class=\"read-more\" href=\"https:\/\/aiinfrahub.com\/about-us\/building-an-mcp-server-using-fastmcp-and-arxiv\/\" aria-label=\"Read more about Building an MCP Server Using FastMCP and arXiv\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":155,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[13],"class_list":["post-144","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-agenticai","tag-python-arxiv-aitools-llmagents-fastmcp-automation-researchai-nlp-agentworkflow-knowledgeautomation-developertools"],"_links":{"self":[{"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/posts\/144","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/comments?post=144"}],"version-history":[{"count":17,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/posts\/144\/revisions"}],"predecessor-version":[{"id":166,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/posts\/144\/revisions\/166"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/media\/155"}],"wp:attachment":[{"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/media?parent=144"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/categories?post=144"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aiinfrahub.com\/about-us\/wp-json\/wp\/v2\/tags?post=144"}],"curies":[{"name":"wp","href":"https:\/\/
api.w.org\/{rel}","templated":true}]}}