Agents

Browser-Use Is The Open Source Agent We Needed

All articles
🤖🦾

Playwright + LLM = The Browser Agent That Actually Ships

Every six months a startup raises 50 million dollars to make an "AI that uses the browser." Most of them are slow, gated, and hilariously expensive. Meanwhile, a small Python library called browser-use does the same job, runs locally, and costs whatever your model API costs. Spoiler: it's better than the commercial options.

The Setup

It's a Playwright wrapper with a smart DOM extractor and a tight agent loop. The model sees an accessibility-tree representation of the page, picks an action, executes, repeats. No bespoke screen-pixel models, no proprietary backend.

{`pip install browser-use
playwright install chromium`}

The Money Pattern

One Agent, one task string, one run call. The agent navigates, clicks, types, scrolls, and reports back. I use it for filling out compliance forms and scraping Google Ads UI bits the API doesn't expose.

{`import asyncio
from browser_use import Agent, Browser, BrowserConfig
from langchain_anthropic import ChatAnthropic

async def main():
    browser = Browser(config=BrowserConfig(headless=False))
    agent = Agent(
        task=(
            "Open Pipedrive, find deals stuck in 'Awaiting Inspection' "
            "for more than 14 days, and export the list as CSV."
        ),
        llm=ChatAnthropic(model="claude-opus-4-5"),
        browser=browser,
        max_steps=25,
    )
    result = await agent.run()
    print(result.final_result())
    await browser.close()

asyncio.run(main())`}

The Catch

SPAs still trip it up. Anything with aggressive virtualised lists or weird custom focus management will confuse the DOM extractor. Captchas defeat it instantly, as they should — if your agent solves captchas you have bigger problems. And running it without headless=False in dev is a great way to misdiagnose a flaky selector for an hour.

The Verdict

browser-use is the project that quietly made commercial browser-agent startups look overpriced. It's MIT licensed, the maintainers ship weekly, and the code is small enough to fork if you need to. I'm wiring it into a Rebuild Relief operations dashboard so the team can ask "open these three claims in Pipedrive and tag them" in plain English. This is what the agent future actually looks like — small, open, composable. Do not @ me when this kills another funded startup.

Let us make some quick suggestions?
Please provide your full name.
Please provide your phone number.
Please provide a valid phone number.
Please provide your email address.
Please provide a valid email address.
Please provide your brand name or website.
Please provide your brand name or website.