Methodology
This page documents how LLMYourWay collects and presents model information. Our goal is to be accurate and transparent.
Data Sources
- Hugging Face: The public Models API supplies model metadata: id, author, tags, pipeline_tag, downloads, likes, and card_data (description, license). We do not modify these fields.
- GitHub (planned): Repository search and release monitoring for open-source model code.
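The pass-through policy above can be sketched roughly as follows. This is a minimal illustration, not LLMYourWay's actual ingestion code: the function name `extract_fields`, the dict layout of the raw record, and the sample values are all assumptions made for the example.

```python
from typing import Any

# Field names as listed above; the dict layout of `raw` is an assumption.
PASS_THROUGH_FIELDS = ("id", "author", "tags", "pipeline_tag", "downloads", "likes")


def extract_fields(raw: dict[str, Any]) -> dict[str, Any]:
    """Copy the documented fields from a raw record without modifying them."""
    record = {name: raw.get(name) for name in PASS_THROUGH_FIELDS}
    card = raw.get("card_data") or {}
    # Values missing upstream stay None rather than being guessed.
    record["description"] = card.get("description")
    record["license"] = card.get("license")
    return record


# Hypothetical record, for illustration only.
sample = {
    "id": "example-org/example-model",
    "author": "example-org",
    "tags": ["text-generation"],
    "pipeline_tag": "text-generation",
    "downloads": 1200,
    "likes": 34,
    "card_data": {"license": "apache-2.0"},
    "extra": "ignored",
}
record = extract_fields(sample)
```

Fields outside the documented set are dropped, and a missing description is kept as an explicit gap instead of being filled in.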
Fields and Definitions
- Downloads: Total downloads reported by Hugging Face for a model repository.
- Likes: Community likes reported by Hugging Face.
- Pipeline: The model’s primary task label on Hugging Face (e.g., text-generation).
- License: As provided by the upstream source (model card or repository).
- Parameters: Only shown when provided by the source; otherwise omitted.
- Context Window: Maximum input tokens accepted by a model (shown when reliably available).
Popularity and Recency
Listings are ordered by downloads, then by most recent synchronization time, then by stars/likes. This keeps widely used and newly updated models visible, and it relies only on reported counts and timestamps rather than inferred or subjective rankings.
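One way to express this ordering is a single sort on a three-part key. The sketch below assumes every criterion is descending (most downloads first, most recently synchronized first, most likes first). The `Listing` type, its field names, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Listing:
    model_id: str
    downloads: int
    synced_at: datetime
    likes: int


def rank(listings: list[Listing]) -> list[Listing]:
    """Order by downloads, then sync time, then likes, all descending."""
    return sorted(
        listings,
        key=lambda m: (m.downloads, m.synced_at, m.likes),
        reverse=True,
    )


# Hypothetical entries: "a" and "b" tie on downloads, so sync time decides.
models = [
    Listing("a", downloads=100, synced_at=datetime(2024, 1, 1), likes=5),
    Listing("b", downloads=100, synced_at=datetime(2024, 1, 2), likes=1),
    Listing("c", downloads=500, synced_at=datetime(2023, 6, 1), likes=0),
]
ordered_ids = [m.model_id for m in rank(models)]
```

Likes only matter when two models tie on both downloads and synchronization time, which is why "a" does not outrank "b" despite having more likes.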
Performance Benchmarks
We do not infer or fabricate benchmarks. When standardized results are available (e.g., from public leaderboards or vendor reports), they are attributed and linked. The database contains a Benchmarks table to store source, date, and details; if none exist for a model, the UI indicates that benchmark data is not available.
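A minimal sketch of such a table and the "not available" fallback, using SQLite for illustration. Only the source, date, and details columns come from the description above. The `model_id` column, the column types, the helper name `benchmark_summary`, and the exact wording of the fallback message are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE benchmarks (
        model_id TEXT NOT NULL,  -- assumed link to the model entry
        source   TEXT NOT NULL,  -- attribution, e.g. a leaderboard URL
        date     TEXT NOT NULL,  -- ISO 8601 date of the result
        details  TEXT NOT NULL   -- description of the result
    )
    """
)


def benchmark_summary(conn: sqlite3.Connection, model_id: str) -> str:
    """Return attributed results, or a fallback message when none are stored."""
    rows = conn.execute(
        "SELECT source, date, details FROM benchmarks "
        "WHERE model_id = ? ORDER BY date DESC",
        (model_id,),
    ).fetchall()
    if not rows:
        # Nothing is inferred or synthesized when no sourced result exists.
        return "Benchmark data not available"
    return "\n".join(f"{details} ({source}, {date})" for source, date, details in rows)
```

Keeping the source and date alongside every result means each displayed number can be attributed and linked, and an empty result set maps directly to the "not available" state.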
Pricing
Pricing is only displayed when the provider publishes clear input/output token rates and units. Open-source models without a hosted API are shown with “—” for pricing. We avoid converting between units unless the provider’s unit is unambiguous.
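The display rule can be sketched as a small formatter that keeps the provider's unit as published and falls back to “—” when no rates exist. The `Pricing` type, its field names, the default currency, and the output format are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pricing:
    input_rate: float   # price per `unit` of input tokens
    output_rate: float  # price per `unit` of output tokens
    unit: str           # the provider's own unit, kept as published
    currency: str = "USD"


def format_pricing(pricing: Optional[Pricing]) -> str:
    """Render published rates, or the placeholder when none are published."""
    if pricing is None:
        return "—"
    return (
        f"{pricing.currency} {pricing.input_rate:g} in / "
        f"{pricing.output_rate:g} out per {pricing.unit}"
    )
```

For example, `format_pricing(None)` yields the placeholder, while a `Pricing(0.5, 1.5, "1M tokens")` value is rendered with its unit unchanged, so no conversion between units takes place.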
Update Cadence
Hugging Face ingestion runs nightly in free mode, and coverage will expand gradually through a gentle prefix crawl. The models page revalidates every 60 seconds, so new entries appear shortly after ingestion.
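The page does not spell out how the prefix crawl works. One plausible reading, sketched below purely as an assumption, is to query the upstream search one name prefix at a time and pause between requests. The `crawl` function, the injected `fetch` callable, the one-second default delay, and the choice of prefixes are all hypothetical.

```python
import string
import time
from typing import Any, Callable, Iterable

Model = dict[str, Any]

# Assumed prefix set: single lowercase letters and digits.
DEFAULT_PREFIXES = string.ascii_lowercase + string.digits


def crawl(
    fetch: Callable[[str], Iterable[Model]],
    prefixes: Iterable[str] = DEFAULT_PREFIXES,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Model]:
    """Collect models prefix by prefix, pausing between upstream requests."""
    seen: dict[str, Model] = {}
    for index, prefix in enumerate(prefixes):
        if index > 0:
            sleep(delay_seconds)  # stay well under upstream rate limits
        for model in fetch(prefix):
            seen[model["id"]] = model  # de-duplicate by model id
    return seen
```

Passing `fetch` and `sleep` in as parameters keeps the sketch free of network calls, so the pacing logic can be exercised with stand-ins before it is pointed at a real API.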
Limitations
- Hugging Face rate limits may delay ingestion. Missing fields reflect gaps in the upstream data.
- Some information (parameter count, context window) is not consistently available upstream and is shown only when a source provides it.