[Feature Request]: Leverage prompt caching by swapping instructions & content #1699
Replies: 1 comment
We implemented this in PR #1873. All 4 extraction prompts in prompts.py now place the static instructions before the URL & content. If you want to try it before it's merged:

pip install git+https://github.com/hafezparast/crawl4ai.git@fix/prompt-caching-order-1699

To verify caching is working, check your provider's dashboard for cached token counts after running a batch extraction across multiple pages with the same instructions.

Would be great to hear if you see the expected cost reduction in practice, especially the actual cache hit rate across a real crawl session.
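Besides the dashboard, cache hits can be spotted programmatically. A minimal sketch, assuming an OpenAI-style usage payload (the `prompt_tokens_details.cached_tokens` field is from OpenAI's Chat Completions API; other providers report this differently):

```python
# Hypothetical helper, not part of crawl4ai: compute what fraction of the
# prompt was served from the provider's prompt cache.
def cached_fraction(usage: dict) -> float:
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    prompt = usage.get("prompt_tokens", 0)
    return cached / prompt if prompt else 0.0

# Example usage payload, as it might look on a second request that shares
# a long prefix with the first one.
usage = {"prompt_tokens": 4000, "prompt_tokens_details": {"cached_tokens": 3584}}
print(f"cache hit rate: {cached_fraction(usage):.0%}")  # → 90%
```

Averaging this over a crawl session gives the real hit rate the reply above asks about.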
What needs to be done?
We should put the instructions first in the prompt, then the URL & content: https://github.com/unclecode/crawl4ai/blob/main/crawl4ai/prompts.py
What problem does this solve?
We could leverage prompt caching. Depending on the provider, cached input tokens can be up to 90% cheaper than normal input tokens. In my use case, I'm crawling a lot of different pages with the same instruction set.
If we changed the prompt to put the instructions first, we could also leverage prompt caching for the HTML content, which is often largely the same across pages (for example, a common body structure).
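To make the idea concrete, here is a minimal sketch (a hypothetical template, not crawl4ai's actual prompt) showing why the ordering matters: with the static instructions first, every page's prompt shares one long identical prefix, which is exactly what providers cache.

```python
# Hypothetical instruction block; in practice this is long and identical
# across every page in a crawl session.
INSTRUCTIONS = (
    "You are an extraction assistant. Extract the requested fields from the "
    "page content and return them as JSON. Follow the schema exactly."
)

def build_prompt(url: str, content: str) -> str:
    # Static part first -> identical, cacheable prefix across all pages;
    # the per-page URL and HTML content only appear after it.
    return f"{INSTRUCTIONS}\n\nURL: {url}\n\nCONTENT:\n{content}"

def shared_prefix_len(a: str, b: str) -> int:
    """Length of the common leading prefix of two prompts."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n

p1 = build_prompt("https://example.com/a", "<html>...page A...</html>")
p2 = build_prompt("https://example.com/b", "<html>...page B...</html>")
# The shared prefix covers the whole instruction block. Note that providers
# typically only cache prefixes above a minimum length (e.g. ~1024 tokens).
assert shared_prefix_len(p1, p2) >= len(INSTRUCTIONS)
```

With the URL first (the current ordering), the prompts would diverge after just a few characters, so nothing past that point could be cached.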
Target users/beneficiaries
Everyone using LLM extraction. It would reduce costs overall. For example, with OpenAI: https://platform.openai.com/docs/pricing
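A back-of-the-envelope sketch of the savings, using hypothetical prices ($2.50 per 1M input tokens, cached input at a 90% discount, i.e. $0.25 per 1M) and hypothetical token counts; check your provider's actual pricing page:

```python
def extraction_cost(pages: int, tokens_per_page: int, cached_per_page: int,
                    price_per_m: float, cached_price_per_m: float) -> float:
    """Total input-token cost in dollars for a batch extraction run."""
    fresh = tokens_per_page - cached_per_page  # tokens billed at full price
    per_page = (fresh * price_per_m + cached_per_page * cached_price_per_m) / 1e6
    return pages * per_page

# 1000 pages, 4000 input tokens each, with a 3000-token shared instruction
# prefix that gets cached after the first request (simplified assumption).
baseline = extraction_cost(1000, 4000, 0, 2.50, 0.25)      # no caching
with_cache = extraction_cost(1000, 4000, 3000, 2.50, 0.25)  # prefix cached
print(f"${baseline:.2f} -> ${with_cache:.2f}")  # → $10.00 -> $3.25
```

The larger the shared prefix relative to the per-page content, the closer the savings get to the provider's maximum discount.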
Current alternatives/workarounds
No response
Proposed approach
No response