Training Data

Training Data, AKA Training Documents, Tagging, Tagging Data, Seed Set:
Documents that reviewers, usually attorneys, have examined and categorized (“tagged”) according to whether they belong to categories such as relevance, privilege, legal issues, or overall “hotness.” Software uses the training data to build a model that predictively codes the remaining, unreviewed documents.
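
As a rough illustration of how tagged documents can become a predictive model, the sketch below trains a very small word-count classifier (a Naive Bayes model with add-one smoothing, chosen here only for simplicity) on a hypothetical seed set and then codes untagged text. The function names, the tag labels, and the sample documents are all assumptions made for this example; commercial review platforms may use quite different algorithms and features.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train(training_data):
    """Build a model from (text, tag) pairs, i.e. the tagged seed set."""
    word_counts = {}        # tag -> Counter of words seen under that tag
    doc_counts = Counter()  # tag -> number of training documents
    vocab = set()
    for text, tag in training_data:
        words = tokenize(text)
        doc_counts[tag] += 1
        word_counts.setdefault(tag, Counter()).update(words)
        vocab.update(words)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the most likely tag for an unreviewed document."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_tag, best_score = None, -math.inf
    for tag, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][tag]
        total_words = sum(counts.values())
        # Log prior plus log likelihood of each word, with add-one smoothing.
        score = math.log(n_docs / total_docs)
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


# Hypothetical seed set: documents already tagged by a reviewer.
seed_set = [
    ("merger pricing agreement with competitor", "relevant"),
    ("pricing discussion before the merger vote", "relevant"),
    ("lunch order for the office party", "not_relevant"),
    ("parking garage closed on friday", "not_relevant"),
]

model = train(seed_set)
print(predict(model, "draft merger pricing memo"))
print(predict(model, "office lunch friday"))
```

In practice a seed set would contain far more documents, and the model's predictions would typically be checked against further attorney review before being relied upon.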

[1] See, e.g., Da Silva Moore v. Publicis Groupe, No. 11-cv-01279, 2012 WL 607412, at *11 (S.D.N.Y. Feb. 24, 2012).