The IT job market can be quite complicated; role names and job titles are often not descriptive enough. When a company is looking for a “Java Developer”, do they want someone to work on the front-end, or the back-end? Does this person need to have leadership or architecting skills? Which Java frameworks do they have to know? We don’t know if the specialist is suitable for the job until we actually compare their CV with the job description. And that takes a lot of time. But maybe it doesn’t have to…
Our client takes a holistic view of each specialist being considered, because a role name and years of experience alone do not describe a person on the job market. To get the full picture, we need to consider things like education, responsibilities, courses taken, and, most importantly, hard and soft skills. It is the skills that we focused on during the collaboration with Talent Alpha. We were responsible for laying the foundation for using machine learning methods to understand the specialist through their skill set.
The second part of the challenge was a bit easier to define. For a platform like Talent Alpha to succeed, it’s crucial to gather a large set of registered specialists (called a talent pool). Only when the number of users is high enough is it possible to find a candidate who perfectly matches the needs of a given role.
The problem is that people are usually not very keen on spending a lot of time filling in forms, especially on a new platform, when they are not yet sure of the benefits they could get from it.
At the same time, we require a lot of data about specialists to be able to match them well. We need to make sure the registration process is as quick and straightforward as it can be while gathering as much information as possible.
The Stermedia team armed Talent Alpha with a way of using word vectorization to describe skills. We connected around 3000 skills from a specialist database into a graph.
This helped us automatically discover the relationships between skills.
It also gave us a nice, configurable way of visualizing those links. We used the connections between skills to create the first version of a skills taxonomy. Taxonomies are used to look at a specialist from different angles. This allows us to see the same skillset differently, depending on the role we are recruiting for.
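To make the idea of a skills graph concrete, here is a minimal sketch. It assumes that two skills are linked when they appear together in the same specialist profile, which is one common way to build such a graph and not necessarily the rule used in the project. The profiles, the `build_skill_graph` helper, and the `min_weight` threshold are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy profiles; the real database held around 3000 skills.
profiles = [
    {"Java", "Spring", "SQL"},
    {"Java", "Spring", "Hibernate"},
    {"Python", "SQL", "Pandas"},
    {"Python", "Pandas", "NumPy"},
    {"Java", "SQL"},
]


def build_skill_graph(profiles, min_weight=2):
    """Return {skill: {neighbour: weight}}.

    Assumption: an edge links two skills that co-occur in at least
    `min_weight` profiles; the weight is the number of co-occurrences.
    """
    pair_counts = Counter()
    for skills in profiles:
        # Sort so that each unordered pair is always counted under one key.
        for a, b in combinations(sorted(skills), 2):
            pair_counts[(a, b)] += 1

    graph = defaultdict(dict)
    for (a, b), weight in pair_counts.items():
        if weight >= min_weight:
            graph[a][b] = weight
            graph[b][a] = weight
    return dict(graph)


graph = build_skill_graph(profiles)
for skill in sorted(graph):
    print(skill, "->", graph[skill])
```

Raising `min_weight` keeps only the strongest links, which is one simple way to make a visualization of the graph configurable.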
Last but not least, the skills graph allowed us to apply machine learning methods to assign a numerical vector to each skill. This way we could measure distances between skills and describe specialists’ skill sets. These two things combined facilitate candidate recommendations.
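The sketch below shows what measuring distances between skills and describing a skill set can look like once every skill has a vector. The three-dimensional vectors are invented for illustration; in practice they would come from a graph embedding method. Averaging the skill vectors is an assumption made here for simplicity, not a description of the project's actual method.

```python
import math

# Hypothetical skill vectors, hand-written for illustration only.
skill_vectors = {
    "Java":   [0.9, 0.1, 0.0],
    "Spring": [0.8, 0.2, 0.1],
    "Python": [0.1, 0.9, 0.2],
    "Pandas": [0.0, 0.8, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def skill_set_vector(skills, vectors):
    """Describe a whole skill set as the mean of its known skill vectors."""
    known = [vectors[s] for s in skills if s in vectors]
    if not known:
        raise ValueError("none of the given skills has a vector")
    return [sum(column) / len(known) for column in zip(*known)]


java_spring = cosine_similarity(skill_vectors["Java"], skill_vectors["Spring"])
java_python = cosine_similarity(skill_vectors["Java"], skill_vectors["Python"])
print(java_spring > java_python)

backend_profile = skill_set_vector(["Java", "Spring"], skill_vectors)
print(backend_profile)
```

With these toy vectors, Java ends up much closer to Spring than to Python, which is the kind of relationship the skill vectors are meant to capture.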
The skills graph and skill vectors opened up many opportunities for new features on the platform. Using them, we created a demo that proposes the roles most suitable for a specialist with a given skill set. It could also propose skills a specialist could learn next and point out important skills that are missing from their portfolio.
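A role-suggestion demo of this kind could be sketched as follows. Everything here is an assumption made for illustration: the role profiles, the toy skill vectors, and the choice to compare a specialist with a role by the cosine similarity of their averaged skill vectors.

```python
import math

# Hypothetical skill vectors and role profiles, invented for illustration.
skill_vectors = {
    "Java":   [0.9, 0.1, 0.0],
    "Spring": [0.8, 0.2, 0.1],
    "SQL":    [0.5, 0.5, 0.1],
    "Python": [0.1, 0.9, 0.2],
    "Pandas": [0.0, 0.8, 0.4],
}
role_profiles = {
    "Backend Developer": ["Java", "Spring", "SQL"],
    "Data Analyst": ["Python", "Pandas", "SQL"],
}


def centroid(skills):
    """Mean vector of a list of skills (all must be in skill_vectors)."""
    rows = [skill_vectors[s] for s in skills]
    return [sum(column) / len(rows) for column in zip(*rows)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def suggest_roles(specialist_skills):
    """Rank roles from most to least similar to the specialist's skill set."""
    profile = centroid(specialist_skills)
    scores = {
        role: cosine(profile, centroid(required))
        for role, required in role_profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def missing_skills(specialist_skills, role):
    """Skills the role lists that the specialist does not have yet."""
    owned = set(specialist_skills)
    return [s for s in role_profiles[role] if s not in owned]


ranking = suggest_roles(["Java", "SQL"])
gaps = missing_skills(["Java", "SQL"], ranking[0])
print(ranking, gaps)
```

The missing skills for the top-ranked role double as a simple "what to learn next" suggestion.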
We also made a prototype mechanism for finding the best candidates for a given job offer using vector computations.
To make it easier to attract new platform users, Stermedia trained a neural named entity recognition (NER) model that automatically pulls information out of CV documents in .docx and .pdf formats. The extracted information includes skills, role names, job responsibilities, education, certificates, and more. This way, a specialist can simply upload an existing CV and take advantage of auto-completion instead of entering everything manually, which significantly speeds up the registration process.
To train a neural network for parsing CVs, we needed to prepare a custom dataset. We hired an internal team of annotators to manually label 1500 documents with Labelbox. Besides training a model, we used Google Vision API to transform .pdf documents into a text format suitable for the named entity recognition model.
Another way we helped enhance platform onboarding was the Instant Onboarding PoC. We tried to capture the rough direction of a specialist’s career path by asking them a few short questions. This would give them an optional quick starting point on the platform, without the need to upload a CV or fill in forms.
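As an illustration of the auto-completion step, the sketch below groups the output of an NER model into registration form fields. The entity labels, the field names, and the `prefill_form` helper are hypothetical. The input is a list of `(text, label)` pairs, which with spaCy could be collected as `[(ent.text, ent.label_) for ent in doc.ents]`.

```python
from collections import defaultdict

# Hypothetical mapping from NER labels to registration form fields.
LABEL_TO_FIELD = {
    "SKILL": "skills",
    "ROLE": "roles",
    "EDUCATION": "education",
    "CERTIFICATE": "certificates",
}


def prefill_form(entities):
    """Group (text, label) pairs into form fields.

    Labels without a matching field are ignored, and duplicates are
    dropped case-insensitively while keeping the first spelling seen.
    """
    form = defaultdict(list)
    seen = set()
    for text, label in entities:
        field = LABEL_TO_FIELD.get(label)
        if field is None:
            continue
        value = text.strip()
        key = (field, value.lower())
        if key in seen:
            continue
        seen.add(key)
        form[field].append(value)
    return dict(form)


# Example entities, as an NER model might return them for one CV.
entities = [
    ("Java", "SKILL"),
    ("Senior Java Developer", "ROLE"),
    ("java", "SKILL"),
    ("Spring", "SKILL"),
    ("MSc Computer Science", "EDUCATION"),
    ("Acme Corp", "ORG"),
]
form = prefill_form(entities)
print(form)
```

The specialist then only needs to review and correct the pre-filled fields instead of typing them from scratch.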
We conducted interviews and polls among our team members, both in Stermedia and Talent Alpha, to make the result of this process as close to real-life as possible.
Stermedia helped Talent Alpha introduce ML-powered components to the platform. We delivered several PoCs, as well as one production-ready solution: the CV parser. We also provided a mathematical way of looking at specialists’ skill sets.
Our contributions include, but are not limited to:
While working with Talent Alpha, we took a product-oriented approach, focusing on understanding the client’s needs and answering them. We believe technology is a tool for solving actual problems, not for showing off. Working closely together during all phases of the project, from conceptualization to deployment, made the collaboration very fruitful. We hope for more projects as interesting as this one in the future.
Python, Jupyter, Streamlit, spaCy, Gensim, node2vec, Docker, Labelbox, GCP, Google App Engine, Google Cloud Run, Google Vision API
Talent Alpha is a SaaS HRtech platform. It is an AI-driven digital product that lets organizations around the globe measure and manage Tech Talent by creating a digital representation of their talent genome. It also allows them to easily scale their IT workforce up and down using the Human Cloud. The platform is aimed at managers running internal and external recruitment processes, planning reskilling and upskilling initiatives, or managing diverse teams. It is also well suited to individual specialists eager to take the next step on their career path.