The proportion of documents in a population that belong to a given category. For example, in the context of production, suppose that 1 million documents have been collected, of which 100,000 are actually relevant and must be produced. Prevalence would be 10% (100,000 / 1,000,000). The lower the prevalence, the larger the training set generally must be.