{"id":8819,"date":"2021-11-08T11:18:54","date_gmt":"2021-11-08T11:18:54","guid":{"rendered":"http:\/\/TheNextWeb=1372373"},"modified":"2021-11-08T11:18:54","modified_gmt":"2021-11-08T11:18:54","slug":"get-these-python-questions-right-to-ace-your-data-science-job-interview","status":"publish","type":"post","link":"https:\/\/www.londonchiropracter.com\/?p=8819","title":{"rendered":"Get these Python questions right to ace your data science job interview"},"content":{"rendered":"\n<div><img decoding=\"async\" src=\"https:\/\/img-cdn.tnwcdn.com\/image\/tnw?filter_last=1&amp;fit=1280%2C640&amp;url=https%3A%2F%2Fcdn0.tnwcdn.com%2Fwp-content%2Fblogs.dir%2F1%2Ffiles%2F2021%2F11%2FUntitled-design-38.jpg&amp;signature=cf88622cf7ccb47fd362f23595d961e3\" class=\"ff-og-image-inserted\"><\/div>\n<p>If you want to have a career in data science, knowing Python is a must. Python is the most popular programming language in data science, especially when it comes to machine learning and artificial intelligence.<\/p>\n<p>To help you in your data science career, I\u2019ve prepared the main Python concepts tested in the data science interview. Later on, I will discuss two main interview question types that cover those concepts you\u2019re required to know as a data scientist. I\u2019ll also show you several example questions and give you solutions to push you in the right direction.<\/p>\n<h2>Technical Concepts of Python Interview Questions<\/h2>\n<p>This guide is not company-specific. So if you have some data science interviews lined up, I strongly advise you to use this guide as a starting point of what might come up in the interview. Additionally, you should also try to find some company-specific questions and try to solve them too. Knowing general concepts and practicing them on real-life questions is a winning combination.<\/p>\n<p>I\u2019ll not bother you with theoretical questions. 
They can come up in the interview, but they cover the same technical concepts found in the coding questions. After all, if you know how to use the concepts I\u2019ll be talking about, you probably know how to explain them too.<\/p>\n<p>Technical Python concepts tested in the data science job interviews are:<\/p>\n<p>Data types<\/p>\n<p>Built-in data structures<\/p>\n<p>User-defined data structures<\/p>\n<p>Built-in functions<\/p>\n<p>Loops and conditionals<\/p>\n<p>External libraries (Pandas)<\/p>\n<h3>1. Data Types<\/h3>\n<p>You should know the most commonly used data types in Python, the differences between them, and when and how to use them. These include integers (int), floats (float), complex numbers (complex), strings (str), booleans (bool), and the null value (None).<\/p>\n<h3>2. Built-in Data Structures<\/h3>\n<p>These are lists, dictionaries, tuples, and sets. Knowing these four built-in data structures will help you organize and store data in a way that makes it easier to access and modify.<\/p>\n<h3>3. User-defined Data Structures<\/h3>\n<p>On top of using the built-in data structures, you should also be able to define and use some of the user-defined data structures. These are arrays, stacks, queues, trees, linked lists, graphs, and hash maps.<\/p>\n<h3>4. Built-in Functions<\/h3>\n<p>Python has over 60 built-in functions. You don\u2019t need to know them all, although it\u2019s better to know as many as possible. The built-in functions you can\u2019t avoid are abs(), isinstance(), len(), list(), min(), max(), pow(), range(), round(), sorted(), and type(). You should also know split(), which is a string method rather than a built-in function.<\/p>\n<h3>5. Loops and Conditionals<\/h3>\n<p>Loops handle repetitive tasks by running the same piece of code over and over again. They do that until a conditional (a true\/false test) tells them to stop.<\/p>\n<h3>6. 
External Libraries (Pandas)<\/h3>\n<p>While several external libraries are used in data science, Pandas is probably the most popular. It is designed for practical data analysis in finance, social sciences, statistics, and engineering.<\/p>\n<h2>Python Interview Types of Questions<\/h2>\n<p>All six technical concepts are mainly tested by only two types of interview questions. Those are:<\/p>\n<p>Data manipulation and analysis<\/p>\n<p>Algorithms<\/p>\n<p>Let\u2019s have a closer look at each of them.<\/p>\n<h3>1. Data Manipulation and Analysis<\/h3>\n<p>These questions are designed to test the above technical concepts by having you solve ETL (extract, transform, load) problems and perform some data analysis.<\/p>\n<p>Here\u2019s one such <a href=\"https:\/\/platform.stratascratch.com\/coding-question?id=10291&amp;python=1\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">example from Facebook<\/a>:<\/p>\n<p><strong><em>QUESTION:<\/em><\/strong> Facebook sends SMS texts when users attempt to 2FA (2-factor authenticate) into the platform to log in. In order to successfully 2FA they must confirm they received the SMS text message. Confirmation texts are only valid on the date they were sent. Unfortunately, there was an ETL problem with the database where friend requests and invalid confirmation records were inserted into the logs, which are stored in the \u2018fb_sms_sends\u2019 table. These message types should not be in the table. 
Fortunately, the \u2018fb_confirmers\u2019 table contains valid confirmation records so you can use this table to identify SMS text messages that were confirmed by the user.<\/p>\n<p>Calculate the percentage of confirmed SMS texts for August 4, 2020.<\/p>\n<p><strong>ANSWER:<\/strong><\/p>\n<pre>import pandas as pd\nimport numpy as np\n\n# Drop the record types that should not be in the table\ndf = fb_sms_sends[[\"ds\", \"type\", \"phone_number\"]]\ndf1 = df[~df[\"type\"].isin(['confirmation', 'friend_request'])]\n\n# Count the remaining texts per day and keep August 4, 2020\ndf1_grouped = df1.groupby('ds')['phone_number'].count().reset_index(name='count')\ndf1_grouped_0804 = df1_grouped[df1_grouped['ds'] == '08-04-2020']\n\n# Match sent texts to valid confirmations on phone number and date\ndf2 = fb_confirmers[[\"date\", \"phone_number\"]]\ndf3 = pd.merge(df1, df2, how='left', left_on=[\"phone_number\", \"ds\"], right_on=[\"phone_number\", \"date\"])\n\n# Count the confirmed texts per day and keep August 4, 2020\ndf3_grouped = df3.groupby('date')['phone_number'].count().reset_index(name='confirmed_count')\ndf3_grouped_0804 = df3_grouped[df3_grouped['date'] == '08-04-2020']\n\n# Percentage of texts sent that day that were confirmed\nresult = (float(df3_grouped_0804['confirmed_count']) \/ df1_grouped_0804['count']) * 100<\/pre>\n<p>One of the questions asked to test your data analysis skills is <a href=\"https:\/\/platform.stratascratch.com\/coding-question?id=10308&amp;python=1\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">this one from Dropbox<\/a>:<\/p>\n<p><em><strong>QUESTION:<\/strong><\/em> Write a query that calculates the difference between the highest salaries found in the marketing and engineering departments. 
Output just the difference in salaries.<\/p>\n<p><strong>ANSWER:<\/strong><\/p>\n<pre>import pandas as pd\nimport numpy as np\n\n# Join employees to their departments\ndf = pd.merge(db_employee, db_dept, how='left', left_on=['department_id'], right_on=['id'])\n\n# Highest salary in engineering\ndf1 = df[df[\"department\"] == 'engineering']\ndf_eng = df1.groupby('department')['salary'].max().reset_index(name='eng_salary')\n\n# Highest salary in marketing\ndf2 = df[df[\"department\"] == 'marketing']\ndf_mkt = df2.groupby('department')['salary'].max().reset_index(name='mkt_salary')\n\n# Difference between the two maximums\nresult = pd.DataFrame(df_mkt['mkt_salary'] - df_eng['eng_salary'])\nresult.columns = ['salary_difference']\nresult<\/pre>\n<h3>2. Algorithms<\/h3>\n<p>Python algorithm interview questions test your problem-solving with algorithms. Since algorithms are not limited to only one programming language, these questions test your logic and thinking, as well as coding in Python.<\/p>\n<p>For example, you could get <a href=\"https:\/\/leetcode.com\/problems\/letter-combinations-of-a-phone-number\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">this question<\/a>:<\/p>\n<p><em><strong>QUESTION:<\/strong><\/em> Given a string containing digits from 2-9 inclusive, return all possible letter combinations that the number could represent. Return the answer in any order.<\/p>\n<p>A mapping of digit to letters (just like on the telephone buttons) is given below. 
Note that 1 does not map to any letters.<\/p>\n<p><strong>ANSWER:<\/strong><\/p>\n<pre>from typing import List\n\nclass Solution:\n    def letterCombinations(self, digits: str) -&gt; List[str]:\n        # If the input is empty, immediately return an empty answer array\n        if len(digits) == 0:\n            return []\n\n        # Map all the digits to their corresponding letters\n        letters = {\"2\": \"abc\", \"3\": \"def\", \"4\": \"ghi\", \"5\": \"jkl\",\n                   \"6\": \"mno\", \"7\": \"pqrs\", \"8\": \"tuv\", \"9\": \"wxyz\"}\n\n        def backtrack(index, path):\n            # If the path is the same length as digits, we have a complete combination\n            if len(path) == len(digits):\n                combinations.append(\"\".join(path))\n                return  # Backtrack\n\n            # Get the letters that the current digit maps to, and loop through them\n            possible_letters = letters[digits[index]]\n            for letter in possible_letters:\n                # Add the letter to our current path\n                path.append(letter)\n                # Move on to the next digit\n                backtrack(index + 1, path)\n                # Backtrack by removing the letter before moving onto the next\n                path.pop()\n\n        # Initiate backtracking with an empty path and starting index of 0\n        combinations = []\n        backtrack(0, [])\n        return combinations<\/pre>\n<p>Or it could get even more difficult with <a href=\"https:\/\/leetcode.com\/problems\/sudoku-solver\/solution\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">the following question<\/a>:<\/p>\n<p><em><strong>QUESTION:<\/strong><\/em> \u201cWrite a program to solve a Sudoku puzzle by filling the empty cells. 
A sudoku solution must satisfy all of the following rules:<\/p>\n<p>Each of the digits 1-9 must occur exactly once in each row.<\/p>\n<p>Each of the digits 1-9 must occur exactly once in each column.<\/p>\n<p>Each of the digits 1-9 must occur exactly once in each of the 9 3\u00d73 sub-boxes of the grid.<\/p>\n<p>The \u2018.\u2019 character indicates empty cells.\u201d<\/p>\n<p><strong>ANSWER:<\/strong><\/p>\n<pre>from collections import defaultdict\n\nclass Solution:\n    def solveSudoku(self, board):\n        \"\"\"\n        :type board: List[List[str]]\n        :rtype: void Do not return anything, modify board in-place instead.\n        \"\"\"\n        def could_place(d, row, col):\n            \"\"\"Check if one could place a number d in (row, col) cell\"\"\"\n            return not (d in rows[row] or d in columns[col] or\n                        d in boxes[box_index(row, col)])\n\n        def place_number(d, row, col):\n            \"\"\"Place a number d in (row, col) cell\"\"\"\n            rows[row][d] += 1\n            columns[col][d] += 1\n            boxes[box_index(row, col)][d] += 1\n            board[row][col] = str(d)\n\n        def remove_number(d, row, col):\n            \"\"\"Remove a number which didn't lead to a solution\"\"\"\n            del rows[row][d]\n            del columns[col][d]\n            del boxes[box_index(row, col)][d]\n            board[row][col] = '.'\n\n        def place_next_numbers(row, col):\n            \"\"\"\n            Call backtrack function in recursion\n            to continue to place numbers\n            till the moment we have a solution\n            \"\"\"\n            # if we're in the last cell\n            # that means we have the solution\n            if col == N - 1 and row == N - 1:\n                nonlocal sudoku_solved\n                sudoku_solved = True\n            # if not yet\n            else:\n                # if we're in the end of the row\n                # go to the next row\n                if col == N - 1:\n                    backtrack(row + 1, 0)\n                # go to the next column\n                else:\n                    backtrack(row, col + 1)\n\n        def backtrack(row=0, col=0):\n            \"\"\"Backtracking\"\"\"\n            # if the cell is empty\n            if board[row][col] == '.':\n                # iterate over all numbers from 1 to 9\n                for d in range(1, 10):\n                    if could_place(d, row, col):\n                        place_number(d, row, col)\n                        place_next_numbers(row, col)\n                        # if sudoku is solved, there is no need to backtrack\n                        # since the single unique solution is promised\n                        if not sudoku_solved:\n                            remove_number(d, row, col)\n            else:\n                place_next_numbers(row, col)\n\n        # box size\n        n = 3\n        # row size\n        N = n * n\n        # lambda function to compute box index\n        box_index = lambda row, col: (row \/\/ n) * n + col \/\/ n\n\n        # init rows, columns and boxes\n        rows = [defaultdict(int) for i in range(N)]\n        columns = [defaultdict(int) for i in range(N)]\n        boxes = [defaultdict(int) for i in range(N)]\n        for i in range(N):\n            for j in range(N):\n                if board[i][j] != '.':\n                    d = int(board[i][j])\n                    place_number(d, i, j)\n\n        sudoku_solved = False\n        backtrack()<\/pre>\n<p>This is quite a complex algorithm, so good for you if you know how to solve it!<\/p>\n<h2>Conclusion<\/h2>\n<p>For a data science interview, the six technical concepts I\u2019ve mentioned are a must. Of course, it\u2019s recommended you dive even deeper into Python and broaden your knowledge. Do that not only in theory but also in practice, by solving as many data manipulation and analysis questions and algorithm questions as possible.<\/p>\n<p>For the first type, there are plenty of examples on <a href=\"https:\/\/www.stratascratch.com\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">StrataScratch<\/a>. 
You could probably find the questions from the company where you applied for a job. And <a href=\"https:\/\/leetcode.com\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">LeetCode<\/a> is a good choice when you decide to practice writing Python algorithms before your interviews.<\/p>\n<p> <a href=\"https:\/\/thenextweb.com\/news\/python-questions-data-science-job-interview-syndication\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you want to have a career in data science, knowing Python is a must. Python is the most popular programming language in data science, especially when it comes to machine learning&#8230;<\/p>\n","protected":false},"author":1,"featured_media":8820,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts\/8819"}],"collection":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=8819"}],"version-history":[{"count":0,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/posts\/8819\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=\/wp\/v2\/media\/8820"}],"wp:attachment":[{"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=8819"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=8819"},{"taxonomy":"post_tag","embeddable":true,"href":"htt
ps:\/\/www.londonchiropracter.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=8819"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}