About This Project

This project provides tools and resources related to the llmstxt.org initiative, which proposes a standard for publishing website content in a form that Large Language Models can consume easily. You can learn more at the main project site: https://llmstxt.org/.

Our Data Collection Process

To provide comprehensive insights, we regularly crawl the top 1 million websites (updated every few days) to find `llms.txt` and `llms-full.txt` files. The collected data is then enriched with additional metrics, including website category and a quality score.
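The discovery step can be sketched as follows. This is a minimal illustration, not the project's actual crawler: the domain list, concurrency, and retry policy are not described in the text, and the function names here are hypothetical.

```python
# Hypothetical sketch of the discovery step: probe each domain's root
# for the two conventional file names. Real crawler details (domain
# source, parallelism, politeness) are assumptions, not the project's code.
import urllib.request
import urllib.error

CANDIDATE_PATHS = ("/llms.txt", "/llms-full.txt")

def candidate_urls(domain: str) -> list[str]:
    """Build the conventional root-level URLs to probe for a domain."""
    return [f"https://{domain}{path}" for path in CANDIDATE_PATHS]

def fetch_llms_files(domain: str, timeout: float = 5.0) -> dict[str, str]:
    """Return {url: content} for whichever llms files the domain serves."""
    found = {}
    for url in candidate_urls(domain):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    found[url] = resp.read().decode("utf-8", errors="replace")
        except (urllib.error.URLError, TimeoutError):
            continue  # missing file or unreachable host: skip quietly
    return found
```

In practice a crawl of a million sites would run these probes concurrently and respect per-host rate limits; the sequential loop above only shows the shape of the check.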

The quality score is derived by validating the discovered `llms.txt` and `llms-full.txt` files against the official specification using the `llms_txt` module available from llmstxt.org.
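As a rough idea of what such validation involves, the sketch below checks the core structural elements the specification describes (an H1 title, a blockquote summary, H2 sections, and Markdown links) and turns them into a crude score. This is not the `llms_txt` module and not the project's actual scoring formula; the equal weighting is an assumption for illustration.

```python
# Minimal structural validator sketch (NOT the official llms_txt module):
# checks the basic shape the llms.txt spec describes and returns a naive
# score in [0, 1]. The weighting scheme here is an assumption.
import re

def validate_llms_txt(text: str) -> dict:
    lines = [line for line in text.splitlines() if line.strip()]
    checks = {
        "has_h1_title": bool(lines) and lines[0].startswith("# "),
        "has_summary_blockquote": any(l.startswith("> ") for l in lines[:5]),
        "has_sections": any(l.startswith("## ") for l in lines),
        "has_links": bool(re.search(r"\[[^\]]+\]\([^)]+\)", text)),
    }
    score = sum(checks.values()) / len(checks)
    return {**checks, "score": score}
```

A file that passes all four checks scores 1.0; one missing its H1 title or link sections scores proportionally lower.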

About the llmstxt Standard

The `llmstxt` standard aims to give website owners a structured way to present their most important content to Large Language Models (LLMs) and other AI agents interacting with their sites. It complements existing standards like `robots.txt` (which governs crawler access) and `sitemap.xml` (which aids page discovery).

An `llms.txt` file is a Markdown file served from the root of a website (`/llms.txt`). It opens with an H1 heading naming the site or project, followed by a blockquote giving a short summary, optional free-form detail, and H2 sections containing lists of links to further resources. The goal is to give LLMs concise, curated access to the information a site considers most important, within their limited context windows.
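For illustration, a minimal `llms.txt` following that structure might look like this (the project name and URLs are hypothetical):

```markdown
# Example Project

> A short summary of what the site offers and how its content is organized.

Optional free-form details can follow the summary.

## Documentation

- [Getting started](https://example.com/docs/start.md): Setup and first steps
- [API reference](https://example.com/docs/api.md): Full endpoint listing

## Optional

- [Changelog](https://example.com/changelog.md): Release history
```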

Key Features

Technology Stack

Built with Astro and Tailwind CSS.