Our product team could not be more excited to introduce our new piece of software, which we are now rolling out across all platforms in the UK, the USA and France. To give you a behind-the-scenes look, we interviewed Jacek, our Lead Machine Learning Engineer and the brains behind the project. This is your chance to understand exactly what the team did, the processes they went through, and what problems the software will help you solve. So, let's begin.
What is Machine Learning?
Jacek: A machine learning model is a piece of software that is taught to find specific patterns in data. Once trained, whenever the software sees a new piece of data, it can recognise what kind of data it is.
Let’s take an example. Say we want to train a hypothetical machine learning model to recognise images of dogs. We feed it many example images, each labelled to say whether or not it shows a dog, so that it learns the visual characteristics of a dog. Now, when the software is presented with a new picture of a dog, it will recognise those characteristics and place the picture under the label “dog”.
How do you train a machine learning model?
Jacek: Effectively, you give the machine learning software a bunch of data, and for every element in that data set, you give it a label to explain what that element represents. From these examples, the model learns to recognise certain characteristics, words or numbers, and then to put new data or images under the right label.
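To make the idea of labelled training data concrete, here is a minimal sketch in Python. It is an illustration only, not the model used in our product: the example questions, the topic labels ("pay", "recruitment") and the choice of a simple naive Bayes word-count classifier are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled examples: each item is (question text, topic label).
TRAINING_DATA = [
    ("what is the salary for graduates", "pay"),
    ("is there a bonus or pay rise each year", "pay"),
    ("how many interview rounds are there", "recruitment"),
    ("what should i prepare for the interview", "recruitment"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Return the most likely label (naive Bayes with add-one smoothing)."""
    word_counts, label_counts = model
    vocabulary = {word for counts in word_counts.values() for word in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from how common the label is, then add evidence word by word.
        score = math.log(label_counts[label] / total_examples)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(predict(model, "how much is the starting salary"))
```

With only four examples this toy model is easy to fool, which is exactly why a real training set needs thousands of carefully labelled items.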
So, how does our product now use machine learning?
Jacek: I trained my machine learning model to categorise different pieces of data. Every time a question is submitted to our platform, the incoming question gets sent to our machine learning model. The model then recognises which topic or category that question belongs to, based on the data included in the question. This means that every incoming question is filed under a topic label, allowing us to better organise and understand our content.
Why did we implement Machine Learning into our product?
Jacek: Implementing machine learning allows us to improve both the quality of our research and the quality of the content on our platforms. The product update lets us categorise information much better, meaning that both employers and candidates can find and analyse the content more easily.
How much time did it take?
Jacek: A lot of time - it was my first project. It took me 6 months to create the minimum viable product (MVP). The MVP is the minimum state in which the model works and can be implemented, but still needs to be improved. Then, it took another 3 months to integrate the machine learning software into our platform.
What was your process when creating the model and how did you improve its accuracy?
Jacek: There are different stages to a machine learning project. The first stage, and the most time-consuming (and boring) one, is the creation of a training data set. I had to take questions from our database and manually read all of them, assigning a topic to each one. I spoke with domain experts to discuss the topics and content on our platform, which helped me create the right labels. I manually labelled 10,000 questions, which is why the process took a lot of time. I then had to run the results by the domain experts to ensure that the labels I had assigned to the questions were correct. Once this was completed, I could start training the model. This preparation step took up the first two months.
The next step is the training itself, where you train different types of models to find out which type is best suited to your problem. Once you have decided on the type of model to use, you train it and fine-tune it a little bit until its performance is good. Fine-tuning also shows you where the model is confused between a few different topics: you may find that some topics need to be merged or separated, and you can make those changes. Once you have completed all the necessary fine-tuning, you have the MVP and can begin integrating the product.
The integration of the product took three months, as it required our whole tech team to focus their efforts on integrating the machine learning software into all elements of our product. You have to think about all of the different places where this model will have an impact.
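One way to spot topics the model confuses is to compare its predictions with the hand-assigned labels on questions it has not seen during training. The sketch below shows the idea in Python. The topic names and the predictions are made up for illustration, and the helper functions `accuracy` and `confused_pairs` are hypothetical names, not part of our actual tooling.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Share of questions where the prediction matches the hand-assigned label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confused_pairs(true_labels, predicted_labels):
    """Count how often each pair of topics is mixed up, ignoring direction."""
    pairs = Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth != guess:
            pairs[tuple(sorted((truth, guess)))] += 1
    return pairs.most_common()


# Hypothetical hand-assigned labels and model predictions for six questions.
truth = ["pay", "pay", "benefits", "benefits", "interview", "interview"]
guess = ["pay", "benefits", "pay", "benefits", "interview", "interview"]

print(accuracy(truth, guess))
print(confused_pairs(truth, guess))
```

In this toy data the only mistakes are between "pay" and "benefits", which would be a hint to consider merging the two topics or defining the boundary between them more clearly.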
How are we using the data we collected?
Jacek: We used our machine learning model to categorise 20,116 questions and 2.9 million views, from 2016 to 2019, into topics. This has allowed us to understand the most popular questions our candidates have been asking. Our client account team are now sharing this valuable data with our clients to show them what candidates have been interested in, both on their platform and in their specific sector. This allows them to build their content strategy on top of the content generated on their individual platforms and through Live Chats. If you would like to have a look at what candidates in your sector are interested in, click the image below:
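Once every question carries a topic label, finding the most popular topics comes down to simple counting. Here is a small Python sketch of that idea. The records, field names and view counts are invented for illustration and do not come from our data; `summarise_by_topic` is a hypothetical helper.

```python
from collections import defaultdict


def summarise_by_topic(questions):
    """Return (topic, question_count, total_views) tuples, most viewed first."""
    totals = defaultdict(lambda: [0, 0])
    for question in questions:
        totals[question["topic"]][0] += 1
        totals[question["topic"]][1] += question["views"]
    rows = [(topic, count, views) for topic, (count, views) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up records: each question has a topic from the model and a view count.
questions = [
    {"topic": "pay", "views": 120},
    {"topic": "interview", "views": 300},
    {"topic": "pay", "views": 80},
    {"topic": "interview", "views": 150},
    {"topic": "culture", "views": 40},
]

for topic, count, views in summarise_by_topic(questions):
    print(topic, count, views)
```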
What problem does this solve for HR teams and Candidates?
Jacek: The problem we solved is understanding the content that candidates generate on the platforms. The software allows us to use this content in a more meaningful way. Employers can find out in real time which questions are trending and most relevant for candidates, allowing them to gauge what’s going on on their platform and what’s on their candidates’ minds.
Let's find out a bit more about the human behind the machine learning update, Jacek:
What was your experience in Machine Learning before PathMotion?
Jacek: I have broad experience in tech start-ups. Previously, I’ve created a chatbot for one company, as well as various types of models. I’ve also made a content categorisation model, which I applied to articles about food, travel and venues. I’ve carried out data analysis for user segmentation and content understanding. I have experience in leading projects from planning to organisation, and I have also managed a small group of data scientists.
How did you get into this career?
Jacek: I studied Computer Science for my bachelor's degree and Machine Learning for my master's. My education gave me the foundation I needed to find my first job as a data scientist in London.
Something interesting about yourself?
Jacek: I have a motorbike and I love taking it out for rides. I have a Kawasaki ER-6N. The best journey I have been on was to Finchingfield, north-east of London: the 2 ½ hour ride to the town was beautiful, with a lot of trees and stunning landscapes, and they have a windmill. On a different note, I'm quite the nerd: I love watching films, playing games and reading fantasy books.