Salary exploration with Python and Payscale data
From web scraping with Selenium to data visualization with Dash
The idea is to automatically look up salaries on payscale.com for a list of jobs in different countries and to visualize the results with a Dash web application.
If we search the website for the salary of a Data Engineer in Italy, we can see that the URL is https://www.payscale.com/research/IT/Job=Data_Engineer/Salary. So it is possible to search the website automatically by providing a country code and a job name.
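As a minimal sketch of that idea, the URL can be assembled from the two inputs. The pattern below is inferred only from the single example URL above, and the names `BASE_URL` and `build_salary_url` are my own, so job titles with special characters may need extra handling.

```python
BASE_URL = "https://www.payscale.com/research"


def build_salary_url(country_code: str, job_name: str) -> str:
    """Return the Payscale salary URL for a country code and a job title."""
    # In the example URL the words of the job title are joined by underscores.
    job_slug = "_".join(job_name.split())
    return f"{BASE_URL}/{country_code.upper()}/Job={job_slug}/Salary"
```

Calling `build_salary_url("IT", "Data Engineer")` reproduces the URL shown above.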
To automate the web search we can use Selenium, a browser automation tool with Python bindings that lets us drive a browser from code. It is primarily used for automated testing of web applications.
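The loop over countries and jobs could look roughly like the sketch below. It assumes Selenium 4 with a local Chrome installation, and the helper names `make_driver` and `fetch_pages` are hypothetical. It only stores the raw HTML of each page, because how the salary is located inside the page depends on Payscale's markup, which is not covered here.

```python
from itertools import product


def make_driver():
    """Start a Chrome session (assumes Selenium 4 and Chrome are installed)."""
    # Imported here so that fetch_pages works with any driver-like object.
    from selenium import webdriver
    return webdriver.Chrome()


def fetch_pages(driver, country_codes, job_names):
    """Open the salary page of every (country, job) pair and keep its HTML."""
    pages = {}
    for country, job in product(country_codes, job_names):
        job_slug = job.replace(" ", "_")
        url = f"https://www.payscale.com/research/{country}/Job={job_slug}/Salary"
        driver.get(url)  # navigate the browser to the salary page
        pages[(country, job)] = driver.page_source  # HTML of the loaded page
    return pages
```

A typical run would create the driver with `make_driver()`, pass it to `fetch_pages` together with the lists of country codes and job names, and call `driver.quit()` at the end to close the browser.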
Every salary exploration is identified by a date string in the YYYYMMDDTHHMMSSZ format.
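Such an identifier can be produced with the standard library. This is a small sketch that assumes the trailing Z means the timestamp is in UTC; the function name `extraction_id` is my own.

```python
from datetime import datetime, timezone


def extraction_id(moment=None):
    """Format a timestamp as YYYYMMDDTHHMMSSZ (current time if none is given)."""
    moment = moment or datetime.now(timezone.utc)
    # Convert to UTC first so that the trailing "Z" is accurate.
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
```

Because the fields go from year down to second, sorting these strings alphabetically also sorts the explorations chronologically.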
After searching for each country and each job, we have to convert each salary from the local currency to a common one (Euro in my case). To do that we can again use Selenium and run a Google search that converts any currency to Euro.
For example, in the United States a Data Engineer earns $93245, so we can search on Google: $93245 to EUR
The results page shows the converted amount, so we simply need to read that value.
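The two steps of the conversion, building the search and reading the number, can be sketched as follows. The helper names are hypothetical, and `parse_amount` assumes the amount is displayed in English-style formatting (comma as thousands separator, dot as decimal separator), which may not hold for other browser locales. Locating the element that contains the converted value is left out, since it depends on Google's page markup.

```python
import re
from urllib.parse import quote_plus


def conversion_query_url(amount, currency_symbol, target="EUR"):
    """Build the Google search URL for a query such as '$93245 to EUR'."""
    query = f"{currency_symbol}{amount} to {target}"
    return "https://www.google.com/search?q=" + quote_plus(query)


def parse_amount(text):
    """Extract the first number from text such as '1,234.56 Euro'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no amount found in {text!r}")
    # Assumption: commas are thousands separators, the dot is the decimal mark.
    return float(match.group().replace(",", ""))
```

With Selenium, the URL returned by `conversion_query_url` would be passed to `driver.get`, and the text of the result element would then go through `parse_amount`.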
Finally, to visualize the data we can build a simple Dash web application.
Using the drop-down menu it is possible to select which extraction you want to visualize.
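A minimal sketch of such an application is shown below, assuming Dash 2.x. The data layout is an assumption of mine: `results` maps each extraction identifier to a list of rows with the hypothetical keys `job`, `country` and `salary_eur`. The real application in the repository may be organized differently.

```python
def dropdown_options(results):
    """One drop-down entry per extraction identifier, newest first."""
    return [{"label": key, "value": key} for key in sorted(results, reverse=True)]


def salary_figure(rows, title):
    """Plotly figure dictionary with one bar per (job, country) row."""
    return {
        "data": [{
            "type": "bar",
            "x": [f"{row['job']} ({row['country']})" for row in rows],
            "y": [row["salary_eur"] for row in rows],
        }],
        "layout": {
            "title": {"text": title},
            "yaxis": {"title": {"text": "Salary (EUR)"}},
        },
    }


def build_app(results):
    """Create the Dash app: a drop-down of extractions and a bar chart."""
    from dash import Dash, Input, Output, dcc, html  # Dash 2.x import style

    app = Dash(__name__)
    options = dropdown_options(results)
    app.layout = html.Div([
        dcc.Dropdown(
            id="extraction",
            options=options,
            value=options[0]["value"] if options else None,
        ),
        dcc.Graph(id="salaries"),
    ])

    @app.callback(Output("salaries", "figure"), Input("extraction", "value"))
    def update_figure(selected):
        # Redraw the chart whenever another extraction is selected.
        return salary_figure(results.get(selected, []), f"Extraction {selected}")

    return app
```

The server can then be started with `build_app(results).run(debug=True)` on recent Dash releases; older versions use `run_server` instead.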
Thanks to payscale.com for the data. Payscale offers a wide variety of products and services in compensation technology. Check it out.
Just to be clear, I do not have any commercial affiliation with Payscale.
Outro
I hope the story was interesting, and thank you for taking the time to read it. The code for this project can be found in this GitHub repository, and on my Blogspot you can find the same post in Italian. Let me know if you have any questions, and if you like the content that I create, feel free to buy me a coffee.