Cannot extract link targets (anymore) #651

Open
JosXa opened this issue Apr 9, 2025 · 3 comments

JosXa commented Apr 9, 2025

Since the upgrade from 1.13.1 to ^2.0.0, Stagehand no longer follows instructions to extract the hrefs of visible anchor tags. Looking at the inference logs, it appears the LLM is now passed only the link text and not the URL, so instructions to grab the URL and start crawling can't work.

Even explicitly annotating the zod field as postUrl: z.string().describe("The relative URL of the href") has no effect: the field just comes back filled with the display text of the link.

The use-case here is to retrieve a list of forum posts and their URLs, without necessarily using stagehand to perform the navigation on those links.
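
For the crawling use-case described above, here is a minimal sketch of what could happen to the extracted links once they come back, with no further Stagehand navigation involved. The ExtractedPost shape, the toAbsolutePosts helper and the example.com URLs are hypothetical and only for illustration; the thread does not say whether hrefs are returned relative or absolute, so the sketch handles both by resolving each one against the page URL with the standard URL class.

```typescript
// Hypothetical shape of one extracted forum post (not part of Stagehand's API).
interface ExtractedPost {
  title: string;
  postUrl: string; // may be relative ("/t/123") or already absolute
}

// Resolve every extracted href against the URL of the page it came from.
// Hrefs that the URL parser rejects are dropped instead of aborting the crawl.
function toAbsolutePosts(posts: ExtractedPost[], pageUrl: string): ExtractedPost[] {
  const resolved: ExtractedPost[] = [];
  for (const post of posts) {
    try {
      resolved.push({ ...post, postUrl: new URL(post.postUrl, pageUrl).href });
    } catch {
      // Unparseable href: skip this entry.
    }
  }
  return resolved;
}

// Example: a relative href is resolved against the (assumed) listing page.
const crawlQueue = toAbsolutePosts(
  [{ title: "Welcome", postUrl: "/t/123" }],
  "https://forum.example.com/latest",
);
console.log(crawlQueue[0].postUrl); // https://forum.example.com/t/123
```

Note that this only normalises whatever string was extracted. If the field contains the link's display text, as reported above, it will still resolve to some URL, so it does not work around the underlying issue.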

kamath (Member) commented Apr 11, 2025

hey! yeah we changed how extract works to make it much faster, but this meant trimming down the content we give to the LLM. @seanmcguire12 is working on adding links back in #655

JosXa (Author) commented Apr 12, 2025

Love the speed boost the new version gave, and looking forward to seeing this implemented :)

seanmcguire12 (Member) commented Apr 17, 2025

hey @JosXa! link extraction is available on the alpha release now if you want to test it out! Within your schema, you'll need to define your link/url field with the following zod type: z.string().url() for it to work. In your case, it would be postUrl: z.string().url()
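
A rough sketch of what such a schema could look like, assuming zod is installed. Only postUrl: z.string().url() comes from the comment above; the posts array, the title field and the example URL are hypothetical names added for illustration. The safeParse calls only show what zod itself accepts for this type and do not involve Stagehand or a browser.

```typescript
import { z } from "zod";

// `postUrl` and its type are taken from the comment above; `posts` and
// `title` are hypothetical names used only for this illustration.
const PostListSchema = z.object({
  posts: z.array(
    z.object({
      title: z.string(),
      postUrl: z.string().url(),
    }),
  ),
});

// Local check with zod alone: an absolute URL passes validation...
const withUrl = PostListSchema.safeParse({
  posts: [{ title: "Welcome", postUrl: "https://forum.example.com/t/1" }],
});

// ...while a link's display text in the same field does not.
const withLinkText = PostListSchema.safeParse({
  posts: [{ title: "Welcome", postUrl: "Welcome thread" }],
});

console.log(withUrl.success, withLinkText.success);
```

The schema would then presumably be passed as the schema option of the extract call, as with any other zod schema. How the alpha release populates the field internally is not described in this thread.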
