Term Extraction for Simultaneous Interpreters
Simultaneous interpretation training made easier through term extraction.
- Challenge: Term extraction from various types of documents for interpreters
- Solution: Web app for multi-language data extraction for interpreters
- Technologies and tools: Python, Amazon Comprehend, NLP, TypeScript, Django, Celery, Postgres, Redis, RabbitMQ, Selenium
Client
When preparing to work on live events, human interpreters need to get acquainted with the relevant terminology, names of people and institutions, and so on. Every language has words that occur rarely or are difficult to pronounce. More technical meetings add another challenge: specialized terms that can come up during interpretation. Term extraction software that surfaces these words in advance can improve the interpreter's preparation and reduce mistakes.
Interprefy is a leading cloud-based Remote Simultaneous Interpretation (RSI) platform that brings remote conference interpreters into your meetings and events anywhere.
Some brief info about the client’s services:
- enables interpreters to work from anywhere, anytime;
- supports dozens of languages;
- works for events and meetings of all shapes and sizes;
- provides integration with different platforms: their own platform and app, Zoom, Webex and so on.
The client wanted to simplify the work of interpreters during live events with the help of term extraction software. They approached the InData Labs team with an initial request to develop a solution that would extract terminology and names from text documents provided before an event.
Challenge: term extraction from various types of documents for interpreters
Every language has a particular set of words that may become pitfalls without diligent preparation. They may include:
- rarely used words;
- words with difficult pronunciation;
- industry-related terms.
Ahead of an event, interpreters are provided with onboarding documents that contain domain-specific terminology, names of delegates and institutions, and other words relevant to the event or domain.
InData Labs' task was to develop data extraction software that pulls the relevant terminology, names of people and institutions, and similar items out of these documents to help interpreters prepare for the event.
Solution: web app for multi-language data extraction for interpreters
The client asked the InData Labs engineers to build an engine that can extract relevant terms, such as domain-specific terminology and names of people or institutions, from a text of an arbitrary domain.
PoC stage
In the PoC stage, we implemented the following:
- Term Extraction Engine for specific terms (words or word forms that rarely occur in the common lexicon).
- Simple UI to demonstrate how it works.
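The idea behind the term extraction engine described above (keeping words that rarely occur in the common lexicon) can be illustrated with a minimal sketch. This is not the engine built for the client: the tiny `COMMON_LEXICON` set, the `extract_rare_terms` function, and the `min_length` parameter are all hypothetical names chosen for illustration, and a real system would likely rely on a large frequency list per language rather than a hand-written set.

```python
import re
from collections import Counter

# Hypothetical stand-in for a large general-language frequency list.
COMMON_LEXICON = {
    "the", "and", "of", "was", "committee", "discussed",
    "drug", "main", "topic",
}

def extract_rare_terms(text, common_lexicon=COMMON_LEXICON, min_length=4):
    """Return candidate terms that are absent from the common lexicon.

    Candidates are ordered by how often they occur in the text,
    most frequent first.
    """
    # Split into lowercase word tokens, keeping hyphenated words together.
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    counts = Counter(
        token for token in tokens
        if len(token) >= min_length and token not in common_lexicon
    )
    return [term for term, _ in counts.most_common()]

sample = (
    "The committee discussed pharmacokinetics and the bioavailability "
    "of the drug. Pharmacokinetics was the main topic."
)
print(extract_rare_terms(sample))
```

With this toy lexicon, only the two specialized words survive the filter. The approach depends entirely on the quality of the reference lexicon, which is why the set here should be read as a placeholder.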
MVP stage
During the app development stage, we completed the following tasks:
Back-end development
The InData Labs team developed the back end for the client as part of its full-stack development services, which saved the client the time of finding a separate trusted vendor.
We enabled fast and easy multi-document upload and processing. In addition, our engineers implemented data scraping, which sped up gathering and compiling data and visualizing the resulting insights, and improved the user experience.
Engine improvements
We also added support for 10+ languages, so that app users can extract terms from documents in any of them quickly and easily.
The next task was to enable language detection during document upload and processing. The app processes the document and detects its language automatically, which saves users time and simplifies the user experience.
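The source does not describe how language detection was implemented. As a rough illustration of the general idea, the sketch below scores a text against small stopword lists and picks the best match. The `STOPWORDS` profiles and the `detect_language` function are hypothetical, cover only four languages, and are far too small for production use.

```python
import re

# Hypothetical, deliberately tiny stopword profiles for illustration only.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "for"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "mit", "für"},
    "fr": {"le", "la", "et", "les", "des", "est", "pour", "dans"},
    "es": {"el", "la", "y", "los", "de", "es", "para", "que"},
}

def detect_language(text, profiles=STOPWORDS):
    """Guess the language code whose stopwords appear most often in the text.

    Returns None when no stopword from any profile is found.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(detect_language("The agenda of the meeting is in the annex"))
print(detect_language("Die Tagesordnung der Sitzung ist nicht mit dem Anhang"))
```

Stopword counting is only one possible technique; it works on longer texts such as event documents but is unreliable for short snippets or closely related languages.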
Front-end development
We implemented the front end of the project ourselves. Our team of engineers set up app authorization and multi-user mode.
Result: fast and error-free interpretation with data extraction
InData Labs, a data extraction service provider, has developed a robust web application for multi-language data extraction. Using the automatic term extraction software, interpreters can extract specific vocabulary and terminology from the agenda papers and convert them into a readable format. This gives interpreters support with the specific vocabulary, improves their preparation for the event, and reduces mistakes.
For interpreters, this supports more accurate interpretation at events. For the company, it strengthens brand positioning on the market and helps win more contracts from clients.