Audiovisual (AV) materials are among the predominant historical and scientific records of our time, and their numbers are increasing exponentially in collecting institutions. Tasked with preserving and making AV materials available, libraries, archives, and museums (LAMs) need to find efficient and scalable curation solutions. Using machine learning (ML) to generate metadata is promising, but to adopt such methods, information professionals must overcome a host of technological and cultural challenges. We introduce the AI4AV project, in which we are conducting research on the design and evaluation of a system (currently a prototype) that uses ML to transcribe audio to text and natural language processing (NLP) to classify and describe AV materials, within open computing infrastructure that can be shared by multiple LAMs. This presentation describes the testbed collection, the ML and NLP methods and computing resources, and the protocol for incorporating LAMs' values into the design and evaluation of the system.
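As a point of orientation, the sketch below illustrates the general shape of such a pipeline: speech-to-text followed by NLP-based labeling of the transcript. The abstract does not name specific models, libraries, or vocabularies, so the tools shown here (OpenAI Whisper and a Hugging Face zero-shot classifier) and the candidate labels are assumptions for illustration only, not the AI4AV implementation.

```python
# Illustrative sketch of an ML transcription + NLP classification pipeline.
# Model choices and labels are hypothetical stand-ins, not the AI4AV system.
import whisper
from transformers import pipeline

# Step 1 (ML): transcribe the audio track of an AV item to text.
asr_model = whisper.load_model("base")
transcript = asr_model.transcribe("example_recording.wav")["text"]

# Step 2 (NLP): assign candidate descriptive labels to the transcript.
# A production system would likely draw labels from controlled
# vocabularies used by the participating LAMs.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")
labels = ["oral history", "news broadcast", "lecture", "music performance"]
result = classifier(transcript, candidate_labels=labels)

# Report the top-ranked label and its score as candidate metadata.
print(result["labels"][0], result["scores"][0])
```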