OpenAI, the renowned artificial intelligence research lab, has shared early results from a trial of its latest feature, which can read text aloud in a convincingly human-like voice. The technology marks a substantial leap forward in AI, while simultaneously raising concerns about potential deepfake risks.
The feature, named Voice Engine, is a text-to-speech model currently in a small-scale preview shared with roughly 10 developers, according to an OpenAI spokesperson. Despite earlier briefings with reporters, the company has chosen not to broaden the release of Voice Engine at this time.
The decision to limit the release followed feedback from a range of stakeholders, including policymakers, industry professionals, educators, and creatives, the spokesperson said. OpenAI had initially planned to offer the tool to up to 100 developers through an application process, as outlined in an earlier press briefing.
In a blog post released on Friday, OpenAI acknowledged the significant risks associated with creating speech that closely mimics human voices, particularly in the midst of an election year. “We are collaborating with domestic and international partners spanning government, media, entertainment, education, civil society, and more to ensure that we incorporate their feedback as we progress,” the post stated.
The move underscores the delicate balance between technological advancement and potential misuse, highlighting the importance of responsible AI development and collaboration with diverse stakeholders. As OpenAI continues to refine Voice Engine, it aims to navigate the complex landscape of AI-generated human-like speech with caution and foresight.