Blog

Explore the blog

Browse practical insights on transcription, captioning, and speech data — with the newest posts first.

Showing 12 posts on this page (527 total)

Building Secure, Inclusive, and Effective Speaker Verification Systems

By Way With Words Team

How Does Speaker Verification Rely on Speech Corpora? The sound of a voice is becomi...

Why Is Paralinguistic Speech Data Crucial in Emotion Detection?

By Way With Words Team

As research continues and multilingual, real-world datasets expand, the potential of paralinguistic speech data will only grow.

Clinical Speech Data: The Voice of the Future in Medicine

By Way With Words Team

From voice biomarkers to automated transcription systems that free clinicians from paperwork, clinical speech data is unlocking new frontiers in diagnosis, monitoring, and patient care.

Challenge of Training Language Identification Speech Systems

By Way With Words Team

This article explores what speech data is used for language identification, the challenges of training such systems, and the industries that depend on them.

Use of Contextual Speech Corpora to Benefit Virtual Assistants

By Way With Words Team

How Do Virtual Assistants Benefit from Contextual Speech Corpora? How to Create Virtual Assistants That Feel Truly Intelligent. Voice assistants have moved...

Training Chatbots: The Critical Role of Speech Data

By Way With Words Team

Chatbots and voice assistants are woven into the fabric of daily life, from guiding us through customer service queries to helping us control smart devices with simple spoken commands.

Importance of Labelling Non-Verbal Events in Speech Data

By Way With Words Team

Non-verbal audio events carry layers of meaning, so labelling them properly is a foundational task in modern speech data annotation.

How Do You Prevent Overfitting in Speech Dataset Design?

By Way With Words Team

One of the most persistent challenges for speech model developers and data scientists is preventing overfitting in speech data.

Audio Recording in the Field: Follow Proven Best Practices

By Way With Words Team

This article explores the key areas of field audio recording, from pre-recording planning and equipment selection to managing conditions, ensuring data safety, and respecting ethics.

Can Open-Source Tools Reliably Collect Quality Audio?

By Way With Words Team

This article explores the strengths and weaknesses of open-source tools, and evaluates their performance across different requirements.

Designing an Effective Semi-supervised Speech Data Pipeline

By Way With Words Team

In a semi-supervised speech data setup, a portion of the dataset is labelled by humans, while a much larger portion remains unlabelled.

How Do You Anonymise Voice Data Samples?

By Way With Words Team

To properly anonymise voice data, several categories must be considered, including speaker identity, spoken content, contextual audio clues, and vocal biometrics.
