Serve an MCP Server in Google Cloud Run

Agentic AI hype is in full motion and the number of published MCP (Model Context Protocol) servers is exploding. Some years back, we joked about having a mobile app available for any purpose you can think of; today it's the same with MCP servers being available for any use case.

Given the popularity of MCP servers and of building AI agents, I wanted to look into how hard it would be to build the most minimalistic ‘hello world MCP server’ and deploy it as a remote MCP server with Google Cloud Run.

Google Cloud Run is a cost-effective way of providing containerized, serverless functionality without the need to pay for expensive, always-on servers.

In this blog post I will show how to build and deploy your own MCP server as a Google Cloud Run service. The post will also show how you can secure your MCP server by authenticating MCP clients with a GCP IAM service account.

But let's start at the beginning, by writing our minimalistic MCP hello world server.

The Hello World MCP server

The Hello World MCP server is one of the simplest MCP servers possible. It offers a single stateless tool without any parameters that greets the caller by returning the string ‘Hello world!’.

It comes with a Dockerfile and a Python requirements.txt file for directly building the docker container that can be deployed into a container runtime such as Google Cloud Run.

To enable MCP clients to call the server remotely, the server offers a streamable HTTP transport.

Within this example, I use FastMCP to create and test my MCP server without the need to use a non-deterministic LLM AI model. FastMCP is a great framework to build and test your own MCP servers.

I installed FastMCP by calling ‘python -m pip install fastmcp’ and then imported it into a minimalistic Python server that defines a single, parameterless MCP tool. When running the MCP server, I selected the ‘streamable-http’ transport to make my MCP server callable remotely, instead of running it locally with the ‘stdio’ transport.

from fastmcp import FastMCP
import os

mcp = FastMCP(
    name= "Hello World Server",
    version="0.1",
    website_url="https://www.smartlab.at")

@mcp.tool
def hello_world() -> str:
    """Returns hello world."""
    return "Hello world!"

if __name__ == "__main__":
    # Cloud Run passes PORT as a string, so convert it to an integer
    mcp.run(transport="streamable-http", host="0.0.0.0", port=int(os.getenv("PORT", 8080)))

Now you can run your MCP server by just starting it with python server.py.
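Cloud Run tells the container which port to listen on through the PORT environment variable, and environment variables always arrive as strings. Below is a minimal sketch of reading that value defensively. The helper name `resolve_port` is my own and not part of FastMCP, and the default of 8080 simply mirrors the server above.

```python
import os


def resolve_port(default: int = 8080) -> int:
    """Return the port from the PORT environment variable as an integer."""
    raw = os.environ.get("PORT", "")
    if not raw.strip():
        # PORT is unset or empty, e.g. when running locally
        return default
    port = int(raw)  # raises ValueError for values such as "abc"
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


if __name__ == "__main__":
    print(f"Server would listen on port {resolve_port()}")
```

The result could be passed as the `port` argument of `mcp.run(...)`, so a misconfigured value fails at startup with a clear message.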

Testing your Remote MCP Server

You might have already encountered the stubborn nature of LLMs. Sometimes they return the expected result, sometimes they stubbornly return a different result or refuse to call your MCP tool no matter how hard you push them.

That's the reason why I love to use FastMCP to deterministically test my MCP tools without the need to involve any costly and stubborn AI model.

With FastMCP calling your MCP tools, MCP acts more like a standard API than an AI framework, which helps us during the testing process.
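To make the ‘standard API’ comparison concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds the request body that a client such as FastMCP sends on your behalf. The helper name `build_tool_call` is hypothetical, and the snippet leaves out the initialization handshake that a real MCP session performs first.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # The parameterless hello_world tool takes an empty arguments object
    print(build_tool_call("hello_world", {}))
```

Seen this way, a tool call is a plain request with a method name and arguments, which is why it can be tested deterministically.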

See the FastMCP Python code below, to test your streamable-http remote MCP server without the need for an Agentic AI client:

import asyncio

from dotenv import load_dotenv
from fastmcp import Client

# Load settings from a local .env file, if present
load_dotenv()


async def test_mcp_server():
    # Test the MCP server using streamable-http transport.
    config = {
        "mcpServers": {
            "hello-mcp": {
                # Remote HTTP/SSE server
                "transport": "http",  # or "sse"
                # Local server started with python server.py (default port 8080)
                "url": "http://127.0.0.1:8080/mcp"
            }
        }
    }

    client = Client(config)

    async with client:
        # List available tools
        tools = await client.list_tools()
        for tool in tools:
            print(f">>> Tool: {tool.name}")
            print(f">>>    {tool.description}")
            print(f">>>    {tool.inputSchema}")

        # Call tool
        print(">>> Calling tool 'hello_world'")
        result = await client.call_tool("hello_world", {})
        print(f"<<< Result: {result.content[0].text}")


if __name__ == "__main__":
    asyncio.run(test_mcp_server())

Dockerize your Hello World MCP Server

Now that I have my MCP server up and running locally, I want to wrap it into a Docker container and ultimately ship it through a serverless runtime, such as Google Cloud Run.

Within my GitHub repository, I created the necessary ‘requirements.txt’ file that lists all library dependencies, along with a Dockerfile for building the Docker container.

See below the ‘requirements.txt’ file:

fastmcp
uvicorn
websockets

and my ‘Dockerfile’:

# Use the official Python lightweight image
FROM python:3.13-slim

# Install uv
COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Install the project into /app
COPY . /app
WORKDIR /app

# Allow statements and log messages to immediately appear in the logs
ENV PYTHONUNBUFFERED=1

# Copy requirements if you have one, else skip this step
COPY requirements.txt ./

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Default port for local runs; Cloud Run sets its own PORT value at runtime
ENV PORT=8080
EXPOSE $PORT

# Run the FastMCP server
CMD ["uv", "run", "server.py"]

Build and Deploy your MCP Server with Google Cloud Build

Now, instead of going through the cumbersome process of building the Docker container, pushing it to Docker Hub and pulling it into my Google Cloud project, I use Google Cloud Build for the entire process.

See below how I navigate to Google Cloud Run, click on Deploy GitHub project and then set up ‘Cloud Build’ by connecting my GitHub account to my Google project.

As a result, Google Cloud Build starts building the container whenever I push code to my GitHub repository. This means that my MCP server is redeployed to Google Cloud Run hands-free with every push to the repo.

See below the setup process in the Google Cloud Console:

After the Cloud Build setup is complete, you have your Hello World MCP server up and running as a Google Cloud Run service, as shown below. The URL shows the HTTPS address Google assigned to your MCP server, which you can now use in your FastMCP test instead of the local ‘127.0.0.1’ address.
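Since the test script already loads a .env file, one way to switch between the local and the deployed server is to read the URL from an environment variable. Here is a small sketch, assuming a variable called `MCP_SERVER_URL` (my naming, not something FastMCP or Cloud Run defines) and the server's default local port 8080:

```python
import os

# Fallback used when MCP_SERVER_URL is not set (local server, default port)
LOCAL_URL = "http://127.0.0.1:8080/mcp"


def build_client_config(server_name: str = "hello-mcp") -> dict:
    """Build the client config dict for the local or the remote server."""
    url = os.environ.get("MCP_SERVER_URL", LOCAL_URL)
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"MCP server URL needs a scheme: {url!r}")
    return {"mcpServers": {server_name: {"transport": "http", "url": url}}}


if __name__ == "__main__":
    print(build_client_config())
```

The returned dict has the same shape as the `config` used in the test script, so it can be handed to `Client(...)` unchanged.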

Create IAM Service Account

I don’t want anybody on the public internet to access my newly deployed MCP server without proper authorization. That's why I selected ‘Require authentication’ during the deployment of the MCP server, as the screenshot above showed.

Now we need to navigate to the Service Accounts page and create a new service account, as shown below:

Assign the role ‘Cloud Run Invoker’ (roles/run.invoker) to your service account, as shown below:

After successfully creating your service account, you need to create a new JSON key file that your FastMCP test script will then use to authenticate against your MCP server.

Within your local Python environment, store the key file outside your project (so you don't accidentally check your secret into GitHub) and set the environment variable ‘GOOGLE_APPLICATION_CREDENTIALS’ to point to your secret key file.
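A quick sanity check before running the test can catch the two most common mistakes here: the variable not being set, and the key file sitting inside the repository. The following is only a sketch using the standard library. The function name `check_credentials_path` is hypothetical, and it verifies the path only, not the key's contents.

```python
import os
from pathlib import Path


def check_credentials_path(project_dir: str) -> Path:
    """Return the key file path, or raise if it is unset, missing, or inside the project."""
    raw = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    key_file = Path(raw).expanduser().resolve()
    if not key_file.is_file():
        raise FileNotFoundError(f"Key file not found: {key_file}")
    project = Path(project_dir).resolve()
    if project in key_file.parents:
        # A key inside the project tree is easy to commit by accident
        raise RuntimeError("Key file lives inside the project; move it out")
    return key_file


if __name__ == "__main__":
    print(check_credentials_path("."))
```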

Allow your Service User to Invoke your Service

Your newly created service account has the role for invoking Google Cloud Run services, but it does not have access to your cloud resource yet. Therefore, I head over to my Google Cloud Run services, select the service (make sure you don’t click it, but just select it in the list of services) and grant my IAM service account execution rights for my service, as shown below:
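Behind the console dialog, this step adds a binding to the service's IAM policy that pairs the invoker role (`roles/run.invoker`) with the service account as a member. The sketch below only models that policy document as a plain Python dict to show its shape. It does not call any Google API, the helper name `add_invoker` is hypothetical, and the service account e-mail is a made-up placeholder.

```python
INVOKER_ROLE = "roles/run.invoker"


def add_invoker(policy: dict, service_account_email: str) -> dict:
    """Return a copy of an IAM policy dict with the account added as invoker."""
    member = f"serviceAccount:{service_account_email}"
    # Copy bindings so the caller's policy dict is left untouched
    bindings = [
        dict(b, members=list(b.get("members", [])))
        for b in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding.get("role") == INVOKER_ROLE:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No invoker binding exists yet, so create one
        bindings.append({"role": INVOKER_ROLE, "members": [member]})
    return {**policy, "bindings": bindings}


if __name__ == "__main__":
    # Placeholder e-mail for illustration only
    print(add_invoker({}, "mcp-tester@example-project.iam.gserviceaccount.com"))
```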

Use OAuth Flow to Authenticate GCP MCP Service

Now, with everything correctly set up and running, I can modify my local FastMCP test to use my IAM service account to authenticate against the MCP server deployed in Google Cloud Run, as shown below:

import asyncio

from dotenv import load_dotenv
from fastmcp import Client

import google.auth.transport.requests
import google.oauth2.id_token

# Load .env file
load_dotenv()


async def test_mcp_server():   
    # Test the MCP server using streamable-http transport.

    service_url = 'https://mcp-hello-world-23234234234.europe-west1.run.app/mcp'

    # Prepare OIDC token
    auth_req = google.auth.transport.requests.Request()
    id_token = google.oauth2.id_token.fetch_id_token(auth_req, service_url)

    config = {
        "mcpServers": {
            "hello-mcp": {
                # Remote HTTP/SSE server
                "transport": "http",  # or "sse" 
                "url": service_url,
                "headers": {
                    "Authorization": f"Bearer {id_token}"
                }
            }
        }
    }

    client = Client(config)

    async with client:
        # List available tools
        tools = await client.list_tools()
        for tool in tools:
            print(f">>> Tool: {tool.name}")
            print(f">>>    {tool.description}")
            print(f">>>    {tool.inputSchema}")
                
        # Call tool
        print(">>> Calling tool 'hello_world'")
        result = await client.call_tool("hello_world", {})
        print(f"<<< Result: {result.content[0].text}")
        
if __name__ == "__main__":
    asyncio.run(test_mcp_server())
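One detail about the audience passed to `fetch_id_token`: as far as I can tell, Google's Cloud Run samples use the service's root URL as the audience, while the request itself goes to the full endpoint including the path. If the token is ever rejected, deriving the audience from the endpoint is a cheap thing to try. Here is a sketch using only the standard library, where `audience_for` is a hypothetical helper name:

```python
from urllib.parse import urlsplit


def audience_for(endpoint: str) -> str:
    """Strip path, query, and fragment so only scheme and host remain."""
    parts = urlsplit(endpoint)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"Expected an https URL, got {endpoint!r}")
    return f"{parts.scheme}://{parts.netloc}"


if __name__ == "__main__":
    print(audience_for("https://mcp-hello-world-23234234234.europe-west1.run.app/mcp"))
```

The result would replace `service_url` only in the `fetch_id_token` call. The client config keeps the full URL ending in `/mcp`.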

I run the authentication flow and receive the following FastMCP result from my remote MCP server hosted in Google Cloud Run:

Summary

Hosting your own remote MCP server within Google Cloud Run is a cost-effective way of exposing specific tools to the Agentic AI world. Read more about AI Agents and Agentic AI in the book ‘Agentic AI’. Safeguarding those tools with authenticated service accounts is essential to avoid misuse of your tools and services and to provide the necessary level of security. Using FastMCP as the framework for building and testing your own MCP tools is a great way to start and to deliver fast and reliably.

Now I am ready to add my own MCP tools to the growing MCP marketplaces that are popping up everywhere at the moment.

Happy building your own MCP server in Google Cloud Run.