NLP Data Science (Fluent Portuguese)
About the job
The work consists essentially of designing, implementing, and training algorithms that extract information from clinical text, as well as collaborating in the development and maintenance of our company's main products. We do not draw a hard line between our research and engineering teams: both do research and develop market-quality software, putting prior research into practice. We mainly work in Python, using our own NLP libraries and other free software libraries.
You must have experience in non-trivial NLP tasks, as well as in creating datasets for model training and evaluation (coordinating the annotation process, or having annotated data yourself). You should also be comfortable doing serious programming work.
We are looking for someone who likes to work in a team, teach and learn from others, and talk about what they are working on with both technical and non-technical team members. In addition, you will be able to prepare scientific communications about the work you have done.
It is important that you feel comfortable helping the rest of the team to get the job done. However, the position focuses on:
- Coordinating the creation of medical NLP datasets and training NLP models. The tasks include, among others:
  - Named Entity Recognition
  - Named Entity Linking
- Collaborating in the writing of papers and posters.
- Adapting models to production environments and different languages.
What we are looking for
- Experience with NLP datasets, whether using pre-existing datasets, correcting them, or creating them from scratch.
- Knowledge of Python and useful libraries for data science (pandas, numpy, sklearn, TensorFlow / PyTorch / Keras).
- General knowledge of Machine Learning.
- Experience implementing deep learning models for NLP.
- The following will also be valued:
  - Experience annotating datasets or coordinating annotation processes.
  - A high level of Portuguese.
What we offer
- Full-time permanent contract.
- Gross annual salary: €40,000-45,000.
- Company profit-sharing scheme.
- Flexible schedule, with the possibility of working from home.
- A warm, transparent and supportive team, with a huge emphasis on work-life balance.
- Lunch together on our sunny terrace.
- The opportunity to make your mark in e-health and AI.
IOMED is a technology company focused on software development. It was launched in 2016, funded by local and international ventures.
We are passionate and talented young professionals from all around Spain and the world (it couldn’t be any other way, as we’re based in beautiful and bright Barcelona). Our “dream team” is made up of mathematicians, statisticians, bioinformaticians, and physicians.
We are looking for people who are eager to innovate and be part of a project with an impact on the healthcare industry, and who enjoy what we do, teamwork, and taking on new challenges.
IOMED is an equal opportunity employer. We are still a small team and are committed to growing in an inclusive manner. We want to augment our team with talented, dynamic people irrespective of race, color, religion, national origin, sex, physical or mental disability, or age.
What we do
Around 50% of clinical trials are currently delayed by patient recruitment, because patient data is collected manually. As a result, clinical research is highly inefficient in both time and cost, taking years and billions of dollars to develop a new drug.
This problem could be solved with Real World Data, i.e. data derived from electronic health records (EHR). Unfortunately, up to 85% of existing clinical data is unstructured, i.e. in plain text. This also contributes to the existence of data silos, making it impossible to aggregate data from different hospitals.
IOMED has found a solution to this situation, making it possible to take advantage of the full value of clinical Real World Data. We developed a tool that extracts the necessary data from clinical texts, producing a structured, standardized, and interoperable database that contains the complete clinical information from hospitals.
In this way, non-reusable information is transformed into data available for clinical research, allowing an enormous increase in the number of criteria-compliant patients identified and a reduction in the total time and manual labor devoted to this task.